Commit Graph

1469 Commits

Author SHA1 Message Date
Michael Kruse ca5f05d2df [Polly][test] Add dependency to count.
Polly does not use the count program itself, but somewhere in lit it is
expected to exist. Otherwise, the following error occurs:

    llvm-lit: llvm-project/llvm/utils/lit/lit/llvm/subst.py:133: fatal: Did not find count in ./bin
2021-08-28 22:50:07 -05:00
Michael Kruse ffa39b4582 [Polly] Fix dumpfunction.ll test. 2021-08-28 22:43:07 -05:00
Michael Kruse e4f3f2c0c5 [Polly] Don't prune non-external function itself from dump. 2021-08-28 17:06:53 -05:00
Michael Kruse 1537563104 [Polly][test] Add missing %loadPolly.
This fixes check-polly when using the -load mechanism,
i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.
2021-08-24 13:47:25 -05:00
Michael Kruse 955b91c19c [Polly] Never consider non-SCoP blocks as error blocks.
Code outside the SCoP will be executed regardless of the code versioning
runtime check introduced by CodeGeneration. Assumptions based on the
premise that these blocks are never executed in Polly-optimized code do
not hold.

This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3
2021-08-23 01:04:01 -05:00
Michael Kruse 9cfab5e249 [Polly] Add support for -polly-dump-before/after with NPM.
The new pass manager does not allow adding module passes at the
-polly-position=before-vectorizer extension point. Introduce a
DumpFunctionPass that dumps only the current function. In contrast to the
legacy pass manager's -polly-dump-before, each function will be dumped
into its own file. -polly-dump-before-file is still not supported.

The DumpFunctionPass uses llvm::CloneModule to copy the current function
into a new module and then write it into a file.
2021-08-22 20:43:35 -05:00
Eli Friedman 3f2828dc28 [polly] Fix up regression test config with current features.
Primarily, configure substitutions so we can copy-paste the "RUN" line
of failed tests without worrying about the paths.
2021-07-30 13:44:48 -07:00
Riccardo Mori ec3da1a43f Update isl to isl-0.24-69-g54aac5ac
This is needed for having the functions isl_{set,map}_n_basic_{set,map}
exported to the C++ interface.
Some tests have been modified to reflect the isl changes.
2021-07-27 17:38:12 +02:00
Michael Kruse 84046ebd95 [Polly] Fix test after D104732.
The SCEV analysis has been improved to identify a write access as a MustWrite.
2021-06-23 14:59:53 -05:00
Bjorn Pettersson 6aac2773d8 [polly][GPGPU] Fixup related to overloading exponent type in llvm.powi
Commit 4c7f820b2b changed the llvm.powi intrinsic to support
different 'int' sizes for the exponent. That happened to break
the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which
obviously should have been updated as part of commit 4c7f820b2b
(https://reviews.llvm.org/D99439).

The shortcoming was found by buildbots that use
   -DPOLLY_ENABLE_GPGPU_CODEGEN=ON

This patch should fixup the problem.
2021-06-18 08:59:06 +02:00
Michael Kruse a56bd7dec8 [Polly][Matmul] Re-pack A in every iteration.
Packed_A must be copied repeatedly, not just for the first iteration of
the outer tile.

This fixes llvm.org/PR50557
2021-06-09 15:19:52 -05:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
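As a rough illustration of what evaluating an AddRec at a given iteration means in fd229caa01, the sketch below models only the affine case, where the recurrence {Start,+,Step} has the value Start + Step * N in iteration N. The `AffineAddRec` struct and the free function are hypothetical stand-ins written for this note; they are not LLVM's SCEV classes and ignore overflow and higher-order recurrences.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an affine add recurrence {Start,+,Step}.
// This is an illustration only, not LLVM's SCEVAddRecExpr.
struct AffineAddRec {
  std::int64_t Start;
  std::int64_t Step;
};

// Value of the recurrence in iteration N, counting iterations from zero.
std::int64_t evaluateAtIteration(const AffineAddRec &AR, std::int64_t N) {
  return AR.Start + AR.Step * N;
}
```

Evaluating directly yields a plain value for the requested iteration, so no intermediate recurrence has to be built.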
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attribute is equivalent to
setting the attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00
Michael Kruse ad568f4286 [Polly] Add support for -polly-dump-after(-file) with the NPM.
For the same reason as with -polly-dump-before, it is only supported
with -polly-position=early.
2021-05-17 22:20:47 -05:00
Michael Kruse 29bef8e4e3 [Polly] Add support for -polly-dump-before(-file) with the NPM.
Only supported with -polly-position=early. Unfortunately, the
extension point callback for VectorizerStart only passes a
FunctionPassManager, making it impossible to add a module pass.
2021-05-17 20:58:37 -05:00
Michael Kruse 5aafcb2b44 [Polly] Add support for -polly-position=early with the NPM.
This required support for the canonicalization passes, including
porting RewriteByReferenceParams to the NPM.

For some reason, the legacy pass pipeline with -polly-position=early did
not run the CodePreparation pass. This was fixed as well.
2021-05-14 12:55:03 -05:00
Michael Kruse 286677870b [Polly][ManualOpt] Match interpretation of unroll metadata to LoopUnroll's.
We previously had a different interpretation of unroll transformation
attributes than how LoopUnroll interpreted it. In particular,
llvm.loop.unroll.enable was needed explicitly to enable it and disabling
metadata was ignored.
Additionally, it required that either full unrolling or an unroll factor
be specified, and failed otherwise. An unroll factor is still required,
but without one the transformation is now ignored in the hope that
LoopUnroll is going to apply the unrolling, since Polly currently does
not implement a heuristic.

Fixes llvm.org/PR50109
2021-04-24 04:30:19 -05:00
Roman Lebedev 2aff4f7f57
[polly] Fix check-polly after SCEVExpander PtrToInt fixes 2021-04-19 19:10:55 +03:00
Michael Kruse 8796451d6e [Polly] Port DeadCodeElim to the NewPM. 2021-03-24 01:01:29 -05:00
Michael Kruse f51427afb5 [Polly][Unroll] Fix unroll_double test.
We enumerated the cross product Domain x Scatter, but sorted only by the
scatter key. In case there are multiple statement instances per
scatter value, the order between statement instances of the same loop
iteration was undefined.

Properly enumerate and sort only by the scatter value, and group the
domains using the scatter dimension again.

Thanks to Leonard Chan for the report.
2021-03-16 09:00:42 -05:00
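The ordering problem in f51427afb5 can be pictured with a small sketch. It does not use isl; the `Instance` pair and the `groupByScatter` helper are hypothetical names introduced here. The idea shown is only the general one: instead of sorting (scatter, instance) pairs by the scatter key alone, which leaves the order of equal keys unspecified, collect the instances into one group per scatter value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a (scatter value, statement instance) pair.
using Instance = std::pair<int, std::string>;

// Groups instances by scatter value. std::map iterates its keys in
// ascending order, and each group keeps the order in which its
// instances were inserted.
std::map<int, std::vector<std::string>>
groupByScatter(const std::vector<Instance> &Instances) {
  std::map<int, std::vector<std::string>> Groups;
  for (const Instance &Inst : Instances)
    Groups[Inst.first].push_back(Inst.second);
  return Groups;
}
```

Iterating over the returned map visits the scatter values in order, and each group holds every instance that shares that value.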
Michael Kruse 3f170eb197 [Polly][Optimizer] Apply user-directed unrolling.
Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule.

While not that useful by itself (there already is an unroll pass), it introduces a mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. ISL's rescheduling is then not applied, since it would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default.

This does not influence the SCoP detection heuristic. As a consequence, loops that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter.

Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the ones supported by clang's LoopHint. See the `unroll_double.ll` test as an example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088.

Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks.

Reviewed By: sebastiankreutzer

Differential Revision: https://reviews.llvm.org/D97977
2021-03-15 13:05:39 -05:00
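For readers unfamiliar with the unrolling metadata that 3f170eb197 refers to, the following sketch shows one way it typically enters the IR from source code: clang's existing `#pragma clang loop unroll_count(N)` hint, which attaches `llvm.loop.unroll.count` metadata to the loop. The function name is made up for this note, and whether Polly then acts on the hint still depends on SCoP detection and the switches named in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a vector. Under clang, the pragma requests an unroll factor of 4
// for the loop; other compilers never see the guarded pragma.
long sumWithUnrollHint(const std::vector<int> &Values) {
  long Sum = 0;
#if defined(__clang__)
#pragma clang loop unroll_count(4)
#endif
  for (std::size_t I = 0; I < Values.size(); ++I)
    Sum += Values[I];
  return Sum;
}
```

The hint changes only how the loop may be transformed, not what it computes, so the function returns the same sum with or without it.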
Roman Lebedev 78b8ce40ef
Reland [SCEV] Improve modelling for (null) pointer constants
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.

This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr` operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and adds constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-13 16:05:34 +03:00
Roman Lebedev 329aeb5db4
Temporarily revert "[SCEV] Improve modelling for (null) pointer constants"
This appears to have broken ubsan bot:
https://lab.llvm.org/buildbot/#/builders/85/builds/3062
https://reviews.llvm.org/D98147#2623549

It looks like LSR needs some kind of a change around insertion point handling.
Reverting until i have a fix.

This reverts commit 61f006ac65.
2021-03-13 09:10:28 +03:00
Roman Lebedev 61f006ac65
[SCEV] Improve modelling for (null) pointer constants
This is a continuation of D89456.

As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr` operation type handling.

This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and adds constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D98147
2021-03-12 22:11:58 +03:00
Roman Lebedev f449e5ef9b
[NFCI] Fix polly tests after b46c085d2b
That commit changed SCEVExpander to emit intrinsics instead of icmp+select,
but i forgot about polly, and i'm not sure if any bots complained.
2021-03-07 20:44:04 +03:00
Michael Kruse b85c98b4c5 [Polly][Codegen] Emit access group metadata.
Emit llvm.loop.parallel_accesses metadata instead of
llvm.mem.parallel_loop_access. The latter is deprecated because it
assumes that LoopIDs are persistent, which they are not.
We also emit parallel access metadata for all surrounding parallel
loops, not just the innermost parallel.
2021-03-04 03:58:03 -06:00
Michael Kruse 91c472c86c [Polly] Fix test after D96534. 2021-02-19 12:49:29 -06:00
Michael Kruse 089421ba9a [Polly] Test all optimization levels. 2021-02-14 00:31:10 -06:00
Michael Kruse 95ef556bd1 [Polly] Preserve DetectionContext references.
DetectionContext objects are stored as values in a DenseMap. When the
DenseMap reaches its maximum load factor, it is resized and all its
objects moved to a new memory allocation. Unfortunately, Scop objects have
a reference to their DetectionContext. When the DenseMap resizes, all the
DetectionContext references now point to invalid memory, even if caused
by an unrelated DetectionContext.

Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the
Verify=true parameter before each pass. This caused the old
DetectionContext to be removed and a new one created and re-verified.
Of course, the Scop object was already created pointing to the old
DetectionContext. Because the new DetectionContext would
usually be stored at the same position in the DenseMap, the reference
would usually reference the new DetectionContext of the same Region.
Usually.
If not, the old position still points to memory in the DenseMap
allocation (unless also a resizing occurs) such that tools like Valgrind
and AddressSanitizer would not be able to diagnose this.

Instead of storing the DetectionContext inside the DenseMap, use a
std::unique_ptr to a DetectionContext allocation, i.e. it will not move
around anymore. This also allows us to remove the very strange

    DetectionContext(const DetectionContext &&)

copy/move(?) constructor. DetectionContext objects now are neither
copied nor moved.

As a result, every re-verification of a DetectionContext will use a new
allocation. Therefore, once a Scop object has been created using a
DetectionContext, it must not be re-verified (the Scop data structure
requires its underlying Region to not change before code generation
anyway). The NewPM may call isMaxRegionInScop only with the
Verify=false parameter.
2021-02-13 03:36:09 -06:00
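The address-stability argument in 95ef556bd1 can be sketched outside LLVM. In the example below, `Context` and `ContextStore` are hypothetical names, and a `std::vector` stands in for `llvm::DenseMap` as a container that moves its elements when it grows. Because the container holds `std::unique_ptr`s, only the pointers are moved on reallocation; the pointed-to objects stay where they were allocated, so references handed out earlier remain valid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Polly's DetectionContext.
struct Context {
  std::string RegionName;
};

// Owns each Context through a std::unique_ptr. When the vector grows and
// reallocates, only the unique_ptrs are moved; every Context keeps its
// heap address.
class ContextStore {
  std::vector<std::unique_ptr<Context>> Slots;

public:
  Context &create(std::string Name) {
    Slots.push_back(std::make_unique<Context>(Context{std::move(Name)}));
    return *Slots.back();
  }

  std::size_t size() const { return Slots.size(); }
};
```

Had the store held `Context` objects by value, a reference returned by `create` could dangle after a later insertion, which is the failure mode the commit message describes for the DenseMap.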
Michael Kruse d50f92a4f0 [Polly] Added dedicated test for working -O3 pipeline.
Test the NewPM as well as the legacy PM.
2021-02-10 13:25:56 -06:00
Michael Kruse 11511ee343 [Polly] Do not use -O3 pipeline for single pass test. 2021-02-10 13:25:56 -06:00
Michael Kruse e200df952b [Polly] Port IslScheduleOptimizer to the NewPM. 2021-02-09 23:56:21 -06:00
Michael Kruse b687fc9122 [Polly] Port PruneUnprofitable to the NewPM. 2021-02-09 23:56:20 -06:00
Michael Kruse 7903d594ea [Polly] Port DeLICM to the NewPM. 2021-02-09 23:56:19 -06:00
Michael Kruse 4c64d8ee3a [Polly] Port ForwardOpTree to the NewPM. 2021-02-09 23:56:19 -06:00
Michael Kruse 3dcb535115 [Polly] Remove use of -O3 in regression test.
Besides the fact that regression tests should not test the entire pass
pipeline (unless they are testing the pipeline itself), Polly-ACC
currently does not support the new pass manager. If the new pass manager
is enabled by default, such tests will therefore fail.

Use the -polly-gpu-runtime and -polly-gpu-arch options also as default
values for the PPCGCodeGeneration pass. This requires moving the options
from the pipeline-building Register passes to the
PPCGCodeGeneration implementation.

Fixes the spir-typesize.ll buildbot fail.
2021-02-09 18:13:35 -06:00
Arthur Eubanks 781a1b1e36 [test] Pin spir-codegen.ll to legacy PM
-polly-enable-delicm is not supported under the new PM but is tested here:
  Assertion `!EnableDeLICM && "This option is not implemented"' failed.
2021-02-03 19:37:32 -08:00
Michael Kruse 3b9677e1ec [Polly] Track defined behavior for PHI predecessor computation.
ZoneAlgorithm's computePHI relies on being provided with a consistent
schedule to compute the statement predecessors of a statement containing
PHINodes. Otherwise, unexpected results such as PHI nodes with multiple
predecessors can occur, which would result in problems in the
algorithms expecting consistent data.

In the added test case, statement instances are scrubbed from the
SCoP if their execution would result in undefined behavior (due to an nsw
overflow). As this is already undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.

Introduce a new SCoP property, the DefinedBehaviorContext, that in
addition to the runtime-checked conditions also tracks the assumptions
not needing a runtime check, in particular those affecting the assumed
control flow. This replaces the manual combination of the 3 other
contexts that was already done in computePHI and setNewAccessRelation.
Currently, the only additional assumption is that loop induction
variables with the nsw flag do not wrap, but potentially more can be
added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.

To limit computational complexity, the DefinedBehaviorContext is not
available if it grows too large (currently hardcoded to 8 disjuncts).

Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choosing an arbitrary value for inconsistent
cases (since it is undefined behavior anyway), or making the code
receiving the result from computePHI handle inconsistent data. All of
them reduce the quality of the implementation, since they require bailing
out more often and remove the ability to assert on actually wrong results.

This fixes llvm.org/PR48783.
2021-01-23 13:03:49 -06:00
Michael Kruse a5b895110f [Polly] Gist new access relations using the SCoP context.
This simplifies the access relations.
2021-01-23 13:03:48 -06:00
Arthur Eubanks cabe1b1124 [polly][NewPM][test] Fix polly tests under -enable-new-pm
In preparation for turning on opt's -enable-new-pm by default, this pins
uses of passes via the legacy "opt -passname" with pass names beginning
with "polly-" and "polyhedral-info" to the legacy PM. Many of these
tests use -analyze, which isn't supported in the new PM.

(This doesn't affect uses of "opt -passes=passname").

rL240766 accidentally removed `-polly-prepare` in
phi_not_grouped_at_top.ll, and it also doesn't use the output of
-analyze.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D94266
2021-01-19 12:38:58 -08:00
Michael Kruse 842314b5f0 [Polly] Update isl to isl-0.23-61-g24e8cd12.
This fixes llvm.org/PR48554

Some test cases had to be updated because the hash function for
union_maps has been changed, which affects the output order.
2021-01-19 12:01:31 -06:00
Juneyoung Lee 278aa65cc4 [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93793
2020-12-30 04:21:04 +09:00
Michael Kruse bc633fe46b [Polly] Consider InvalidContext to determine partial READ.
MemoryAccess::setNewAccessRelation() in assert-builds checks whether the
access relation for a READ has a memory location for every instance of
the domain. Otherwise, we would not have a value to load from. That check
already considered that instances outside the Scop's context do not
matter since they are never executed (or would be undefined behavior).
This patch also takes instances of the InvalidContext into account,
as these can also be assumed to never occur. InvalidContext was
introduced to avoid the computational complexity of subtracting
restrictions from the AssumedContext. However, this additional check in
setNewAccessRelation is only done in assert-builds.

The assertion case with an InvalidContext may occur with DeLICM on a
conditionally infinite loop, as is the case in the following code:

    for (int i = 0; i < n; i+=b)
      vreg = ...;
    *Dest = vreg;

The loop is infinite when b=0, and [b] -> { : b = 0 }  is part of the
InvalidContext. When DeLICM tries to map the memory for %vreg to *Dest,
there is no store instance that uses the value of vreg when b = 0, hence
no location to map it to. However, the case is irrelevant since Polly's
runtime condition check ensures that this is never the case.

Fixes llvm.org/PR48445
2020-12-10 22:25:19 -06:00
Michael Kruse 6249bfeefe [Polly][CodeGen] Remove use of ScalarEvolution.
ScalarEvolution::getSCEV cannot be used during codegen. ScalarEvolution
assumes a stable IR and control flow which is under construction during
Polly's CodeGen. In particular, it uses DominatorTree to compute the
backedge-taken count. However, the DominatorTree is not updated during
codegen.

In this case, SCEV was used to determine the base pointer of an array
access. Replace it by our own function. Polly generates only GEPs and
BitCasts for array accesses, i.e. it is sufficient to handle these two to
find the base pointer.

Fixes llvm.org/PR48422
2020-12-07 15:21:51 -06:00
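The replacement described in 6249bfeefe, finding the base pointer by looking only through GEPs and BitCasts, can be pictured with a toy model. The `IRValue` struct and `Kind` enum below are hypothetical and deliberately tiny; they are not LLVM's `Value` hierarchy, and the actual function added by the commit may differ in its details.

```cpp
#include <cassert>

// Hypothetical miniature of an IR value; not LLVM's class hierarchy.
enum class Kind { Argument, GEP, BitCast, Other };

struct IRValue {
  Kind K;
  const IRValue *PointerOperand; // Only meaningful for GEP and BitCast.
};

// Walks through GEP and BitCast nodes to the value they were derived
// from. Any other kind of value is treated as the base pointer.
const IRValue *findBasePointer(const IRValue *V) {
  assert(V && "expected a non-null value");
  while ((V->K == Kind::GEP || V->K == Kind::BitCast) && V->PointerOperand)
    V = V->PointerOperand;
  return V;
}
```

The walk inspects only the values it is handed and consults no analysis results, which is the property the commit message asks for while the IR is still under construction.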
Michael Kruse c8a0e27cfb [Polly][OpTree] Fix mid-processing change of access kind.
Operand tree forwarding can cause the change of an access kind; in
particular change from a scalar kind to an array kind if the scalar
dependency is not necessary. Such an access cannot and doesn't need to
be forwarded anymore.

Fixes llvm.org/PR48034
2020-11-11 16:21:48 -06:00
Michael Kruse c1cf51e777 [Polly][OpTree] Better report applied changes.
Print to dbgs() any taken action.

Also, read-only scalars do not require any action unless
-polly-analyze-read-only-scalars=true is used. Better reflect this by
using ForwardingAction::triviallyForwardable and thus not bumping the
statistics.
2020-11-11 16:21:48 -06:00
Michael Kruse e408935bb5 [Polly][ScopBuilder] Use only modeled instructions to compute statement granularity.
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
Non-modeled instructions were used in some parts of determining
instruction independence, which could lead to the re-introduction of
non-modeled instructions.

In particular, required invariant loads could be added to the instruction
list, which then led to redundant MemoryAccesses for such a load.

This fixes llvm.org/PR48059.
2020-11-10 15:30:16 -06:00
Roman Lebedev b4916918e5
[SCEV] SCEVPtrToIntExpr simplifications
If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown,
we want to sink the SCEVPtrToIntExpr into an operand,
so that the operation is performed on integers,
and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89692
2020-10-30 11:13:35 +03:00
Roman Lebedev 81fc53a36a
[SCEV] Introduce SCEVPtrToIntExpr (PR46786)
And use it to model LLVM IR's `ptrtoint` cast.

This is essentially an alternative to D88806, but with no chance for
all the problems it caused due to having the cast as implicit there.
(see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3)

As we've established by now, there are at least two reasons why we want this:
* It will allow SCEV to actually model the `ptrtoint` casts
  and their operands, instead of treating them as `SCEVUnknown`
* It should help with the initial problem of PR46786 - this should eventually allow us
  to not lose pointer-ness of an expression in more cases

As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 | PR46786 ]], in principle,
we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()`
should sink the cast as far down into the expression as possible,
so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`.

But i think that it isn't the best solution, because it doesn't really matter
from memory consumption side - there probably won't be *that* many `SCEVPtrToIntExpr`s
for it to matter, and it allows for much better discoverability.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D89456
2020-10-30 11:13:35 +03:00