Commit Graph

1505 Commits

Author SHA1 Message Date
Michael Kruse 6b3b87376b [polly] migrate -polly-show to the new pass manager
Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D123678
2022-05-09 14:04:29 -05:00
Michael Kruse 809ca66eac [Polly] Fix test after D119669. 2022-05-01 13:32:42 -05:00
Arthur Eubanks caf6af2ed7 [polly] Remove last instances of -analyze
As mentioned in D120782, the loop block order can be different depending
on if LoopInfo is incrementally updated or freshly computed.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D122195
2022-03-24 09:47:43 -07:00
Michael Kruse 12ac339e9e [polly] Fix NPM unittests after D121566. 2022-03-18 14:25:44 -05:00
Wael Yehia c80198b3d3 Reland "Load pass plugins during option processing, so that plugin options are registered and live."
Fix Polly failures.

Reviewed By: mehdi_amini, Meinersbur

Differential Revision: https://reviews.llvm.org/D121566
2022-03-18 03:27:53 +00:00
Sam McCall 75acad41bc Use lit_config.substitute instead of foo % lit_config.params everywhere
This mechanically applies the same changes from D121427 everywhere.

Differential Revision: https://reviews.llvm.org/D121746
2022-03-16 09:57:41 +01:00
Michael Kruse 5c02808131 [polly] Introduce -polly-print-* passes to replace -analyze.
The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduced -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests.

There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze on the `-loops` pass which is not part of Polly.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D120782
2022-03-14 10:27:15 -05:00
Petr Hosek 0c0f6cfb7b [CMake] Rename TARGET_TRIPLE to LLVM_TARGET_TRIPLE
This clarifies that this is an LLVM specific variable and avoids
potential conflicts with other projects.

Differential Revision: https://reviews.llvm.org/D119918
2022-03-11 15:43:01 -08:00
Arthur Eubanks 30f1cef86b Revert "[polly] Fix regression test after D110620."
This reverts commit 2aa624a94f.

D110620 was reverted.
2022-03-04 20:37:15 -08:00
Michael Kruse d7851685a3 [polly] Remove trailing whitespace from tests. NFC. 2022-02-22 15:41:13 -06:00
Michael Kruse 2aa624a94f [polly] Fix regression test after D110620. 2022-02-17 09:47:19 -06:00
Roman Lebedev 4d0c0e6cc2
[SCEV] `createNodeForSelectOrPHIInstWithICmpInstCond()`: generalize eq handling
The current logic was: https://alive2.llvm.org/ce/z/j8muXk
but in reality the offset to the Y in the 'true' hand
does not need to exist: https://alive2.llvm.org/ce/z/MNQ7DZ
https://alive2.llvm.org/ce/z/S2pMQD

To catch that, instead of computing the Y's in both
hands and checking their equality, compute Y and C,
and check that C is 0 or 1.
2022-02-11 21:58:19 +03:00
Florian Hahn 782c0dd1a1
[IRBuilder] Migrate and-folding to value-based FoldAnd.
Similar to the migration of or-folding to FoldOr, there are a few cases
where the fold in IRBuilder::CreateAnd triggered directly. Those have
been updated.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117431
2022-01-20 10:22:21 +00:00
Michael Kruse 937b00ab2c [Polly][SchedOpt] Account for prevectorization of multiple statements.
A prevectorized loop may contain multiple statements, in which case
isl_schedule_node_band_sink will sink the vector band to multiple
leaves. Instead of statically assuming a specific tree structure after
sinking, add a SIMD marker to all inner bands.

Fixes llvm.org/PR52637
2021-12-23 14:06:41 -06:00
Michael Kruse 19db33c06e [Polly] Remove support for code generated by gfortran+DragonEgg.
DragonEgg is not maintained anymore, hence there is no need for this
functionality.

Fixes llvm.org/PR52173
2021-10-14 14:12:06 -05:00
Michael Kruse 203c7fab73 [Polly] Fix test case fixing the colon.
Commit 573531fb1f fixed the colon at the
end of a CHECK line (was a semicolon by mistake). With the check
enabled, it turned out that it was failing. Check for the correct
content.

Also add the missing colon to the next CHECK line.
2021-10-08 22:46:55 -05:00
Qiu Chaofan 573531fb1f Fix typo of colon to semicolon in lit tests 2021-10-09 10:03:50 +08:00
Michael Kruse 64489255be [Polly] Add greedy fusion algorithm.
When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer
tries to aggressively fuse any band it can and does not violate any
dependences.

As part if the implementation, the functionalty for copying a band
into an new schedule was extracted out of the ScheduleTreeRewriter.
2021-10-08 20:33:30 -05:00
Michael Kruse cb879d00d8 [Polly] Completely remove -polly-opt-fusion.
This was missing from 07e7cb9433.
The switch did nothing since then.
2021-10-08 02:10:34 -05:00
Philip Reames d02db32644 [SCEV] Use full logic when infering flags on add and gep
This is a followon to D109845. With that landed, we will have fixed all known instances of pr51817, and can thus start inferring flags more aggressively with greatly reduced risk of miscompiles. This patch simply applies the same inference logic used in that patch to our other major flag inference path.

We can still do much better here (on both paths), but this is our first step.

Differential Revision: https://reviews.llvm.org/D111003
2021-10-03 15:32:15 -07:00
Philip Reames 2ca8a3f213 [SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes
This fixes a violation of the wrap flag rules introduced in c4048d8f. This was also noted in the (very old) PR23527.

The issue being fixed is that we assume the inbound flag on any GEP assumes that all users of *any* gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true.

In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when its legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics.

The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow ideas mentioned in PR51817.

It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before or after exploits that.

Differential Revision: https://reviews.llvm.org/D109789
2021-10-01 16:30:44 -07:00
Roman Gareev 113fa82c3c [Polly] Check the properties of accesses to operands of a matrix-matrix
multiplication

The following code modifies elements of the array D.

    for (i = 0; i < _PB_NI; i++)
      for (j = 0; j < _PB_NJ; j++)
      {
        for (k = 0; k < _PB_NK; k++)
        {
          double Mul = A[i][k] * B[k][j];
          D[i][j][k] += Mul;
          C[i][j] += Mul;
        }
      }

Nevertheless, the code is recognised as a matrix-matrix multiplication, since
the second and third dimensions of D are accessed with non-zero strides.

This fixes the typo, which was made during the translation to C++ bindings
(https://reviews.llvm.org/D35845).

Reviewed By: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D110491
2021-09-28 22:58:57 +05:00
Michael Kruse 027c036663 [Polly] Reject regions entered by an indirectbr/callbr.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.

This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.

This fixes llvm.org/PR51964

Recommit with "REQUIRES: asserts" in test that uses statistics.
2021-09-27 18:49:11 -05:00
Haowei Wu 283ed7de32 Revert "[Polly] Reject reject regions entered by an indirectbr/callbr."
This reverts commit 91f46bb77e which
causes test failures when assertions are off.
2021-09-27 16:05:33 -07:00
Michael Kruse 91f46bb77e [Polly] Reject reject regions entered by an indirectbr/callbr.
SplitBlockPredecessors is unable to insert an additional BasicBlock
between an indirectbr/callbr terminator and the successor blocks.
This is needed by Polly to normalize the control flow before emitting
its optimzed code.

This patches rejects regions entered by an indirectbr/callbr to not fail
later at code generation.

This fixes llvm.org/PR51964
2021-09-26 21:21:50 -05:00
Michael Kruse 9820dd970c [Polly] Support for InlineAsm.
Inline assembly was not handled at all and treated like a llvm::Value.
In particular, it tried to create a pointer it which is not allowed.

Fix by handling like a llvm::Constant such that it is just reused when
required, instead of trying to marshall it in memory.

Fixes llvm.org/PR51960
2021-09-26 03:26:43 -05:00
Michael Kruse d5c87162db [Polly] Use VirtualUse to determine references.
VirtualUse ensures consistency over different source of values with
Polly. In particular, this enables its use of instructions moved between
Statement. Before the patch, the code wrongly assumed that the BB's
instructions are also the ScopStmt's instructions. Reference are
determined for OpenMP outlining and GPGPU kernel extraction.

GPGPU CodeGen had some problems. For one, it generated GPU kernel
parameters for constants. Second, it emitted GPU-side invariant loads
which have already been loaded by the host. This has been partially
fixed, it still generates a store for the invariant load result, but
using the value that the host has already written.

WARNING: I did not test the generated PollyACC code on an actual GPU.

The improved consistency will be made use of in the next patch.
2021-09-26 03:26:43 -05:00
Michael Kruse 1cea25eec9 [Polly] Remove isConstCall.
The function was intended to catch OpenMP functions such as
get_thread_id(). If matched, the call would be considered synthesizable.

There were a few problems with this:

 * get_thread_id() is not 'const' in the sense of have the gcc manual
   defines it: "do not examine any values except their arguments".
   get_thread_id() reads OpenCL runtime libreary global state.
   What was inteded was probably 'speculable'.

 * isConstCall was implemented using mayReadOrWriteMemory(). 'const' is
   stricter than that, mayReadOrWriteMemory is e.g. true for malloc(),
   since it may only read/write addresses that are considered
   inaccessible fro the application. However, malloc is certainly not
   speculable.

 * Values that are isConstCall were not handled consistently throughout
   Polly. In particular, it was not considered for referenced values
   (OpenMP outlining and PollyACC).

Fix by removing special handling for isConstCall entirely.
2021-09-26 03:26:43 -05:00
Michael Kruse a5d47b3fa0 [Polly] Fix wrong redirect in test case. 2021-09-24 14:53:00 -05:00
Michael Kruse e470f9268a [Polly] Implement user-directed loop distribution/fission.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.

It interprets the same metadata as the LoopDistribute pass.

Re-apply after revert in c7bcd72a38 with
fix: Take isBand out of #ifndef NDEBUG since it now is used
unconditionally.
2021-09-23 21:11:01 -05:00
Petr Hosek c7bcd72a38 Revert "[Polly] Implement user-directed loop distribution/fission."
This reverts commit 52c30adc7d which
breaks the build when NDEBUG is defined.
2021-09-23 14:04:25 -07:00
Michael Kruse 07e7cb9433 [Polly] Remove -polly-opt-fusion option.
The name of the option is misleading and has been renamed by isl to
"serialize-sccs". Instead of also renaming the option, remove it.
The option is still accessible using

    -polly-isl-arg=--no-schedule-serialize-sccs
2021-09-23 15:43:08 -05:00
Michael Kruse 35f7020098 [Polly] Dissolve Isl test directory. NFC.
All tests use ISL, integrate its subfolder into the components they
belong to.
2021-09-22 17:45:07 -05:00
Michael Kruse 52c30adc7d [Polly] Implement user-directed loop distribution/fission.
This is a simple version without the possibility to define distribute
points or followup-transformations. However, it is the first
transformation that has to check whether the transformation is correct.

It interprets the same metadata as the LoopDistribute pass.
2021-09-22 17:28:25 -05:00
Michael Kruse cad9f98a2a [Polly] Don't generate inter-iteration noalias metadata.
This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing.

Rhe implemention had multiple issues:

 * The structure of the noalias metadata was malformed. D110026 added check in the verifier for this metadata, and the tests were failing since then.

 * This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating.

 * Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scop list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with.

 * Since the !noalias scope list would ideally consists of all other SCEV for this base pointer, we might run quickly into scalability issues. Especially after unrolling there would probably at least once SCEV per instruction and unroll instance.

 * The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well.

A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is the what scev-aa does as well.

This effectively reverts D30606 and D35761.
2021-09-20 22:20:17 -05:00
Nikita Popov 53720f74e4 [Polly] Partially fix scoped alias metadata
This partially addresses the verifier failures caused by D110026.
In particular, it does not fix the "second level" alias metadata.
2021-09-20 22:51:31 +02:00
Michael Kruse ca5f05d2df [Polly][test] Add dependency to count.
Polly does not use the count program itself, but somewhere in lit it is
expected to exists. Otherwise, the following error occurs:

    llvm-lit: llvm-project/llvm/utils/lit/lit/llvm/subst.py:133: fatal: Did not find count in ./bin
2021-08-28 22:50:07 -05:00
Michael Kruse ffa39b4582 [Polly] Fix dumpfunction.ll test. 2021-08-28 22:43:07 -05:00
Michael Kruse e4f3f2c0c5 [Polly] Don't prune non-external function itself from dump. 2021-08-28 17:06:53 -05:00
Michael Kruse 1537563104 [Polly][test] Add missing %loadPolly.
This fixes check-polly when using the -load mechanism,
i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.
2021-08-24 13:47:25 -05:00
Michael Kruse 955b91c19c [Polly] Never consider non-SCoP blocks as error blocks.
Code outside the SCoP will be executed recardless of the code versioning
runtime check introduced by CodeGeneration. Assumption made based on
that these are never executed in Polly-optimized code does not hold.

This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3
2021-08-23 01:04:01 -05:00
Michael Kruse 9cfab5e249 [Polly] Add support for -polly-dump-before/after with NPM.
The new pass manager does not allow adding module passes at the
-polly-position=before-vectorizer extension point. Introduce a
DumpFunctionPass that dumps only current function. In contrast to the
legacy pass manager's -polly-dump-before, each function will be dumped
into its own file. -polly-dump-before-file is still not supported.

The DumpFunctionPass uses llvm::CloneModule to copy the current function
into a new module and then write it into a file.
2021-08-22 20:43:35 -05:00
Eli Friedman 3f2828dc28 [polly] Fix up regression test config with current features.
Primarily, configure substitutions so we can copy-paste the "RUN" line
of failed tests without worrying about the paths.
2021-07-30 13:44:48 -07:00
Riccardo Mori ec3da1a43f Update isl to isl-0.24-69-g54aac5ac
This is needed for having the functions isl_{set,map}_n_basic_{set,map}
exported to the C++ interface.
Some tests have been modified to reflect the isl changes.
2021-07-27 17:38:12 +02:00
Michael Kruse 84046ebd95 [Polly] Fix test after D104732.
The SCEV analysis has been improved to identify a write access as a MustWrite.
2021-06-23 14:59:53 -05:00
Bjorn Pettersson 6aac2773d8 [polly][GPGPU] Fixup related to overloading exponent type in llvm.powi
Commit 4c7f820b2b changed the llvm.powi intrinsic to support
different 'int' sizes for the exponent. That happened to break
the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which
obviously should have been updated as part of commit 4c7f820b2b
(https://reviews.llvm.org/D99439).

The shortcoming was found by buildbots that use
   -DPOLLY_ENABLE_GPGPU_CODEGEN=ON

This patch should fixup the problem.
2021-06-18 08:59:06 +02:00
Michael Kruse a56bd7dec8 [Polly][Matmul] Re-pack A in every iteration.
Packed_A must be copied repeatedly, not just for the first iteration of
the outer tile.

This fixes llvm.org/PR50557
2021-06-09 15:19:52 -05:00
Eli Friedman fd229caa01 [polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs.
When we're remapping an AddRec, the AddRec constructed by a partial
rewrite might not make sense.  This triggers an assertion complaining
it's not loop-invariant.

Instead of constructing the partially rewritten AddRec, just skip
straight to calling evaluateAtIteration.

Testcase was automatically reduced using llvm-reduce, so it's a little
messy, but hopefully makes sense.

Differential Revision: https://reviews.llvm.org/D102959
2021-06-01 09:51:05 -07:00
serge-sans-paille 4ab3041acb Revert "[NFC] remove explicit default value for strboolattr attribute in tests"
This reverts commit bda6e5bee0.

See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance
2021-05-24 19:43:40 +02:00
serge-sans-paille bda6e5bee0 [NFC] remove explicit default value for strboolattr attribute in tests
Since d6de1e1a71, no attributes is quivalent to
setting attribute to false.

This is a preliminary commit for https://reviews.llvm.org/D99080
2021-05-24 19:31:04 +02:00