llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	19db33c06e	[Polly] Remove support for code generated by gfortran+DragonEgg. DragonEgg is not maintained anymore, hence there is no need for this functionality. Fixes llvm.org/PR52173	2021-10-14 14:12:06 -05:00
Michael Kruse	203c7fab73	[Polly] Fix test case fixing the colon. Commit `573531fb1f` fixed the colon at the end of a CHECK line (was a semicolon by mistake). With the check enabled, it turned out that it was failing. Check for the correct content. Also add the missing colon to the next CHECK line.	2021-10-08 22:46:55 -05:00
Qiu Chaofan	573531fb1f	Fix typo of colon to semicolon in lit tests	2021-10-09 10:03:50 +08:00
Michael Kruse	64489255be	[Polly] Add greedy fusion algorithm. When the option -polly-loopfusion-greedy is set, the ScheduleOptimizer tries to aggressively fuse any bands it can without violating any dependences. As part of the implementation, the functionality for copying a band into a new schedule was extracted out of the ScheduleTreeRewriter.	2021-10-08 20:33:30 -05:00
Michael Kruse	cb879d00d8	[Polly] Completely remove -polly-opt-fusion. This was missing from `07e7cb9433`. The switch did nothing since then.	2021-10-08 02:10:34 -05:00
Philip Reames	d02db32644	[SCEV] Use full logic when inferring flags on add and gep This is a follow-on to D109845. With that landed, we will have fixed all known instances of pr51817, and can thus start inferring flags more aggressively with greatly reduced risk of miscompiles. This patch simply applies the same inference logic used in that patch to our other major flag inference path. We can still do much better here (on both paths), but this is our first step. Differential Revision: https://reviews.llvm.org/D111003	2021-10-03 15:32:15 -07:00
Philip Reames	2ca8a3f213	[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes This fixes a violation of the wrap flag rules introduced in `c4048d8f`. This was also noted in the (very old) PR23527. The issue being fixed is that we assume the inbound flag on any GEP implies that all users of any gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true. In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when it's legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics. The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow-up ideas mentioned in PR51817. It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before nor after exploits that. Differential Revision: https://reviews.llvm.org/D109789	2021-10-01 16:30:44 -07:00
Roman Gareev	113fa82c3c	[Polly] Check the properties of accesses to operands of a matrix-matrix multiplication The following code modifies elements of the array D. for (i = 0; i < _PB_NI; i++) for (j = 0; j < _PB_NJ; j++) { for (k = 0; k < _PB_NK; k++) { double Mul = A[i][k] * B[k][j]; D[i][j][k] += Mul; C[i][j] += Mul; } } Nevertheless, the code is recognised as a matrix-matrix multiplication, since the second and third dimensions of D are accessed with non-zero strides. This fixes the typo, which was made during the translation to C++ bindings (https://reviews.llvm.org/D35845). Reviewed By: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D110491	2021-09-28 22:58:57 +05:00
Michael Kruse	027c036663	[Polly] Reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964 Recommit with "REQUIRES: asserts" in test that uses statistics.	2021-09-27 18:49:11 -05:00
Haowei Wu	283ed7de32	Revert "[Polly] Reject reject regions entered by an indirectbr/callbr." This reverts commit `91f46bb77e` which causes test failures when assertions are off.	2021-09-27 16:05:33 -07:00
Michael Kruse	91f46bb77e	[Polly] Reject reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964	2021-09-26 21:21:50 -05:00
Michael Kruse	9820dd970c	[Polly] Support for InlineAsm. Inline assembly was not handled at all and treated like a llvm::Value. In particular, it tried to create a pointer to it, which is not allowed. Fix by handling it like a llvm::Constant such that it is just reused when required, instead of trying to marshall it in memory. Fixes llvm.org/PR51960	2021-09-26 03:26:43 -05:00
Michael Kruse	d5c87162db	[Polly] Use VirtualUse to determine references. VirtualUse ensures consistency over different sources of values with Polly. In particular, this enables its use of instructions moved between statements. Before the patch, the code wrongly assumed that the BB's instructions are also the ScopStmt's instructions. References are determined for OpenMP outlining and GPGPU kernel extraction. GPGPU CodeGen had some problems. For one, it generated GPU kernel parameters for constants. Second, it emitted GPU-side invariant loads which have already been loaded by the host. This has been partially fixed: it still generates a store for the invariant load result, but using the value that the host has already written. WARNING: I did not test the generated PollyACC code on an actual GPU. The improved consistency will be made use of in the next patch.	2021-09-26 03:26:43 -05:00
Michael Kruse	1cea25eec9	[Polly] Remove isConstCall. The function was intended to catch OpenMP functions such as get_thread_id(). If matched, the call would be considered synthesizable. There were a few problems with this: * get_thread_id() is not 'const' in the sense of how the gcc manual defines it: "do not examine any values except their arguments". get_thread_id() reads OpenCL runtime library global state. What was intended was probably 'speculable'. * isConstCall was implemented using mayReadOrWriteMemory(). 'const' is stricter than that, mayReadOrWriteMemory is e.g. true for malloc(), since it may only read/write addresses that are considered inaccessible from the application. However, malloc is certainly not speculable. * Values that are isConstCall were not handled consistently throughout Polly. In particular, it was not considered for referenced values (OpenMP outlining and PollyACC). Fix by removing special handling for isConstCall entirely.	2021-09-26 03:26:43 -05:00
Michael Kruse	a5d47b3fa0	[Polly] Fix wrong redirect in test case.	2021-09-24 14:53:00 -05:00
Michael Kruse	e470f9268a	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass. Re-apply after revert in `c7bcd72a38` with fix: Take isBand out of #ifndef NDEBUG since it now is used unconditionally.	2021-09-23 21:11:01 -05:00
Petr Hosek	c7bcd72a38	Revert "[Polly] Implement user-directed loop distribution/fission." This reverts commit `52c30adc7d` which breaks the build when NDEBUG is defined.	2021-09-23 14:04:25 -07:00
Michael Kruse	07e7cb9433	[Polly] Remove -polly-opt-fusion option. The name of the option is misleading and has been renamed by isl to "serialize-sccs". Instead of also renaming the option, remove it. The option is still accessible using -polly-isl-arg=--no-schedule-serialize-sccs	2021-09-23 15:43:08 -05:00
Michael Kruse	35f7020098	[Polly] Dissolve Isl test directory. NFC. All tests use ISL, integrate its subfolder into the components they belong to.	2021-09-22 17:45:07 -05:00
Michael Kruse	52c30adc7d	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass.	2021-09-22 17:28:25 -05:00
Michael Kruse	cad9f98a2a	[Polly] Don't generate inter-iteration noalias metadata. This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing. The implementation had multiple issues: * The structure of the noalias metadata was malformed. D110026 added a check in the verifier for this metadata, and the tests were failing since then. * This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating. * Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scope list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with. * Since the !noalias scope list would ideally consist of all other SCEVs for this base pointer, we might quickly run into scalability issues. Especially after unrolling there would probably be at least one SCEV per instruction and unroll instance. * The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it to noalias as well. A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is what scev-aa does as well. This effectively reverts D30606 and D35761.	2021-09-20 22:20:17 -05:00
Nikita Popov	53720f74e4	[Polly] Partially fix scoped alias metadata This partially addresses the verifier failures caused by D110026. In particular, it does not fix the "second level" alias metadata.	2021-09-20 22:51:31 +02:00
Michael Kruse	ca5f05d2df	[Polly][test] Add dependency to count. Polly does not use the count program itself, but somewhere in lit it is expected to exist. Otherwise, the following error occurs: llvm-lit: llvm-project/llvm/utils/lit/lit/llvm/subst.py:133: fatal: Did not find count in ./bin	2021-08-28 22:50:07 -05:00
Michael Kruse	ffa39b4582	[Polly] Fix dumpfunction.ll test.	2021-08-28 22:43:07 -05:00
Michael Kruse	e4f3f2c0c5	[Polly] Don't prune non-external function itself from dump.	2021-08-28 17:06:53 -05:00
Michael Kruse	1537563104	[Polly][test] Add missing %loadPolly. This fixes check-polly when using the -load mechanism, i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.	2021-08-24 13:47:25 -05:00
Michael Kruse	955b91c19c	[Polly] Never consider non-SCoP blocks as error blocks. Code outside the SCoP will be executed regardless of the code versioning runtime check introduced by CodeGeneration. Assumptions based on these blocks never being executed in Polly-optimized code do not hold. This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3	2021-08-23 01:04:01 -05:00
Michael Kruse	9cfab5e249	[Polly] Add support for -polly-dump-before/after with NPM. The new pass manager does not allow adding module passes at the -polly-position=before-vectorizer extension point. Introduce a DumpFunctionPass that dumps only the current function. In contrast to the legacy pass manager's -polly-dump-before, each function will be dumped into its own file. -polly-dump-before-file is still not supported. The DumpFunctionPass uses llvm::CloneModule to copy the current function into a new module and then writes it into a file.	2021-08-22 20:43:35 -05:00
Eli Friedman	3f2828dc28	[polly] Fix up regression test config with current features. Primarily, configure substitutions so we can copy-paste the "RUN" line of failed tests without worrying about the paths.	2021-07-30 13:44:48 -07:00
Riccardo Mori	ec3da1a43f	Update isl to isl-0.24-69-g54aac5ac This is needed for having the functions isl_{set,map}_n_basic_{set,map} exported to the C++ interface. Some tests have been modified to reflect the isl changes.	2021-07-27 17:38:12 +02:00
Michael Kruse	84046ebd95	[Polly] Fix test after D104732. The SCEV analysis has been improved to identify a write access as a MustWrite.	2021-06-23 14:59:53 -05:00
Bjorn Pettersson	6aac2773d8	[polly][GPGPU] Fixup related to overloading exponent type in llvm.powi Commit `4c7f820b2b` changed the llvm.powi intrinsic to support different 'int' sizes for the exponent. That happened to break the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which obviously should have been updated as part of commit `4c7f820b2b` (https://reviews.llvm.org/D99439). The shortcoming was found by buildbots that use -DPOLLY_ENABLE_GPGPU_CODEGEN=ON This patch should fixup the problem.	2021-06-18 08:59:06 +02:00
Michael Kruse	a56bd7dec8	[Polly][Matmul] Re-pack A in every iteration. Packed_A must be copied repeatedly, not just for the first iteration of the outer tile. This fixes llvm.org/PR50557	2021-06-09 15:19:52 -05:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
serge-sans-paille	4ab3041acb	Revert "[NFC] remove explicit default value for strboolattr attribute in tests" This reverts commit `bda6e5bee0`. See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance	2021-05-24 19:43:40 +02:00
serge-sans-paille	bda6e5bee0	[NFC] remove explicit default value for strboolattr attribute in tests Since `d6de1e1a71`, no attribute is equivalent to setting attribute to false. This is a preliminary commit for https://reviews.llvm.org/D99080	2021-05-24 19:31:04 +02:00
Michael Kruse	ad568f4286	[Polly] Add support for -polly-dump-after(-file) with the NPM. For the same reason as with -polly-dump-before, it is only supported with -polly-position=early.	2021-05-17 22:20:47 -05:00
Michael Kruse	29bef8e4e3	[Polly] Add support for -polly-dump-before(-file) with the NPM. Only supported with -polly-position=early. Unfortunately, the extension point callback for VectorizerStart only passes a FunctionPassManager, making it impossible to add a module pass.	2021-05-17 20:58:37 -05:00
Michael Kruse	5aafcb2b44	[Polly] Add support for -polly-position=early with the NPM. This required support for the canonicalization passes, including porting RewriteByReferenceParams to the NPM. For some reason, the legacy pass pipeline with -polly-position=early did not run the CodePreparation pass. This was fixed as well.	2021-05-14 12:55:03 -05:00
Michael Kruse	286677870b	[Polly][ManualOpt] Match interpretation of unroll metadata to LoopUnroll's. We previously had a different interpretation of unroll transformation attributes than how LoopUnroll interpreted it. In particular, llvm.loop.unroll.enable was needed explicitly to enable it and disabling metadata was ignored. Additionally, it required that either full unrolling or an unroll factor be specified, and failed otherwise. An unroll factor is still required, but otherwise the transformation is ignored in the hope that LoopUnroll is going to apply the unrolling, since Polly currently does not implement a heuristic. Fixes llvm.org/PR50109	2021-04-24 04:30:19 -05:00
Roman Lebedev	2aff4f7f57	[polly] Fix check-polly after SCEVExpander PtrToInt fixes	2021-04-19 19:10:55 +03:00
Michael Kruse	8796451d6e	[Polly] Port DeadCodeElim to the NewPM.	2021-03-24 01:01:29 -05:00
Michael Kruse	f51427afb5	[Polly][Unroll] Fix unroll_double test. We enumerated the cross product Domain x Scatter, but sorted only by the scatter key. In case there are multiple statement instances per scatter value, the order between statement instances of the same loop iteration was undefined. Properly enumerate and sort only by the scatter value, and group the domains using the scatter dimension again. Thanks to Leonard Chan for the report.	2021-03-16 09:00:42 -05:00
Michael Kruse	3f170eb197	[Polly][Optimizer] Apply user-directed unrolling. Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule. While not that useful by itself (there already is an unroll pass), it introduces a mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations, it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default. This does not influence the SCoP detection heuristic. As a consequence, loops that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter. Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the ones supported by clang's LoopHint. See the `unroll_double.ll` test as an example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088. Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks. Reviewed By: sebastiankreutzer Differential Revision: https://reviews.llvm.org/D97977	2021-03-15 13:05:39 -05:00
Roman Lebedev	78b8ce40ef	Reland [SCEV] Improve modelling for (null) pointer constants This reverts commit `329aeb5db4`, and relands commit `61f006ac65`. This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-13 16:05:34 +03:00
Roman Lebedev	329aeb5db4	Temporarily revert "[SCEV] Improve modelling for (null) pointer constants" This appears to have broken ubsan bot: https://lab.llvm.org/buildbot/#/builders/85/builds/3062 https://reviews.llvm.org/D98147#2623549 It looks like LSR needs some kind of a change around insertion point handling. Reverting until i have a fix. This reverts commit `61f006ac65`.	2021-03-13 09:10:28 +03:00
Roman Lebedev	61f006ac65	[SCEV] Improve modelling for (null) pointer constants This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-12 22:11:58 +03:00
Roman Lebedev	f449e5ef9b	[NFCI] Fix polly tests after `b46c085d2b` That commit changed SCEVExpander to emit intrinsics instead of icmp+select, but i forgot about polly, and i'm not sure if any bots complained.	2021-03-07 20:44:04 +03:00
Michael Kruse	b85c98b4c5	[Polly][Codegen] Emit access group metadata. Emit llvm.loop.parallel_accesses metadata instead of llvm.mem.parallel_loop_access. The latter is deprecated because it assumes that LoopIDs are persistent, which they are not. We also emit parallel access metadata for all surrounding parallel loops, not just the innermost parallel.	2021-03-04 03:58:03 -06:00
Michael Kruse	91c472c86c	[Polly] Fix test after D96534.	2021-02-19 12:49:29 -06:00

1491 Commits