llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	ca5f05d2df	[Polly][test] Add dependency to count. Polly does not use the count program itself, but somewhere in lit it is expected to exist. Otherwise, the following error occurs: llvm-lit: llvm-project/llvm/utils/lit/lit/llvm/subst.py:133: fatal: Did not find count in ./bin	2021-08-28 22:50:07 -05:00
Michael Kruse	ffa39b4582	[Polly] Fix dumpfunction.ll test.	2021-08-28 22:43:07 -05:00
Michael Kruse	e4f3f2c0c5	[Polly] Don't prune non-external function itself from dump.	2021-08-28 17:06:53 -05:00
Michael Kruse	1537563104	[Polly][test] Add missing %loadPolly. This fixes check-polly when using the -load mechanism, i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.	2021-08-24 13:47:25 -05:00
Michael Kruse	955b91c19c	[Polly] Never consider non-SCoP blocks as error blocks. Code outside the SCoP will be executed regardless of the code versioning runtime check introduced by CodeGeneration. Assumptions based on these blocks never being executed in Polly-optimized code do not hold. This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3	2021-08-23 01:04:01 -05:00
Michael Kruse	9cfab5e249	[Polly] Add support for -polly-dump-before/after with NPM. The new pass manager does not allow adding module passes at the -polly-position=before-vectorizer extension point. Introduce a DumpFunctionPass that dumps only current function. In contrast to the legacy pass manager's -polly-dump-before, each function will be dumped into its own file. -polly-dump-before-file is still not supported. The DumpFunctionPass uses llvm::CloneModule to copy the current function into a new module and then write it into a file.	2021-08-22 20:43:35 -05:00
Eli Friedman	3f2828dc28	[polly] Fix up regression test config with current features. Primarily, configure substitutions so we can copy-paste the "RUN" line of failed tests without worrying about the paths.	2021-07-30 13:44:48 -07:00
Riccardo Mori	ec3da1a43f	Update isl to isl-0.24-69-g54aac5ac This is needed for having the functions isl_{set,map}_n_basic_{set,map} exported to the C++ interface. Some tests have been modified to reflect the isl changes.	2021-07-27 17:38:12 +02:00
Michael Kruse	84046ebd95	[Polly] Fix test after D104732. The SCEV analysis has been improved to identify a write access as a MustWrite.	2021-06-23 14:59:53 -05:00
Bjorn Pettersson	6aac2773d8	[polly][GPGPU] Fixup related to overloading exponent type in llvm.powi Commit `4c7f820b2b` changed the llvm.powi intrinsic to support different 'int' sizes for the exponent. That happened to break the IntrinsicToLibdeviceFunc mapping in PPCGCodeGeneration, which obviously should have been updated as part of commit `4c7f820b2b` (https://reviews.llvm.org/D99439). The shortcoming was found by buildbots that use -DPOLLY_ENABLE_GPGPU_CODEGEN=ON This patch should fixup the problem.	2021-06-18 08:59:06 +02:00
Michael Kruse	a56bd7dec8	[Polly][Matmul] Re-pack A in every iteration. Packed_A must be copied repeatedly, not just for the first iteration of the outer tile. This fixes llvm.org/PR50557	2021-06-09 15:19:52 -05:00
Eli Friedman	fd229caa01	[polly] Fix SCEVLoopAddRecRewriter to avoid invalid AddRecs. When we're remapping an AddRec, the AddRec constructed by a partial rewrite might not make sense. This triggers an assertion complaining it's not loop-invariant. Instead of constructing the partially rewritten AddRec, just skip straight to calling evaluateAtIteration. Testcase was automatically reduced using llvm-reduce, so it's a little messy, but hopefully makes sense. Differential Revision: https://reviews.llvm.org/D102959	2021-06-01 09:51:05 -07:00
serge-sans-paille	4ab3041acb	Revert "[NFC] remove explicit default value for strboolattr attribute in tests" This reverts commit `bda6e5bee0`. See https://lab.llvm.org/buildbot/#/builders/109/builds/15424 for instance	2021-05-24 19:43:40 +02:00
serge-sans-paille	bda6e5bee0	[NFC] remove explicit default value for strboolattr attribute in tests Since `d6de1e1a71`, no attribute is equivalent to setting attribute to false. This is a preliminary commit for https://reviews.llvm.org/D99080	2021-05-24 19:31:04 +02:00
Michael Kruse	ad568f4286	[Polly] Add support for -polly-dump-after(-file) with the NPM. For the same reason as with -polly-dump-before, it is only supported with -polly-position=early.	2021-05-17 22:20:47 -05:00
Michael Kruse	29bef8e4e3	[Polly] Add support for -polly-dump-before(-file) with the NPM. Only supported with -polly-position=early. Unfortunately, the extension point callback for VectorizerStart only passes a FunctionPassManager, making it impossible to add a module pass.	2021-05-17 20:58:37 -05:00
Michael Kruse	5aafcb2b44	[Polly] Add support for -polly-position=early with the NPM. This required support for the canonicalization passes, including porting RewriteByReferenceParams to the NPM. For some reason, the legacy pass pipeline with -polly-position=early did not run the CodePreparation pass. This was fixed as well.	2021-05-14 12:55:03 -05:00
Michael Kruse	286677870b	[Polly][ManualOpt] Match interpretation of unroll metadata to LoopUnroll's. We previously had a different interpretation of unroll transformation attributes than how LoopUnroll interpreted it. In particular, llvm.loop.unroll.enable was needed explicitly to enable it and disabling metadata was ignored. Additionally, it required either full unrolling or an unroll factor to be specified and failed otherwise. An unroll factor is still required, but the transformation is ignored with the hope that LoopUnroll is going to apply the unrolling, since Polly currently does not implement a heuristic. Fixes llvm.org/PR50109	2021-04-24 04:30:19 -05:00
Roman Lebedev	2aff4f7f57	[polly] Fix check-polly after SCEVExpander PtrToInt fixes	2021-04-19 19:10:55 +03:00
Michael Kruse	8796451d6e	[Polly] Port DeadCodeElim to the NewPM.	2021-03-24 01:01:29 -05:00
Michael Kruse	f51427afb5	[Polly][Unroll] Fix unroll_double test. We enumerated the cross product Domain x Scatter, but sorted only by the scatter key. In case there are multiple statement instances per scatter value, the order between statement instances of the same loop iteration was undefined. Properly enumerate and sort only by the scatter value, and group the domains using the scatter dimension again. Thanks to Leonard Chan for the report.	2021-03-16 09:00:42 -05:00
Michael Kruse	3f170eb197	[Polly][Optimizer] Apply user-directed unrolling. Make Polly look for unrolling metadata (https://llvm.org/docs/TransformMetadata.html#loop-unrolling) that is usually only interpreted by the LoopUnroll pass and apply it to the SCoP's schedule. While not that useful by itself (there already is an unroll pass), it introduces a mechanism to apply arbitrary loop transformation directives in arbitrary order to the schedule. Transformations are applied until no more directives are found. Since ISL's rescheduling would discard the manual transformations and it is assumed that when the user specifies the sequence of transformations, they do not want any other transformations to apply. Applying user-directed transformations can be controlled using the `-polly-pragma-based-opts` switch and is enabled by default. This does not influence the SCoP detection heuristic. As a consequence, loops that do not fulfill SCoP requirements or the initial profitability heuristic will be ignored. `-polly-process-unprofitable` can be used to disable the latter. Other than manually editing the IR, there is currently no way for the user to add loop transformations in an order other than the order in the default pipeline, or transformations other than the ones supported by clang's LoopHint. See the `unroll_double.ll` test as an example that clang currently is unable to emit. My own extension of `#pragma clang loop` allowing an arbitrary order and additional transformations is available here: https://github.com/meinersbur/llvm-project/tree/pragma-clang-loop. An effort to upstream this functionality as `#pragma clang transform` (because `#pragma clang loop` has an implicit transformation order defined by the loop pipeline) is D69088. Additional transformations from my downstream pragma-clang-loop branch are tiling, interchange, reversal, unroll-and-jam, thread-parallelization and array packing. Unroll was chosen because it uses already-defined metadata and does not require correctness checks. Reviewed By: sebastiankreutzer Differential Revision: https://reviews.llvm.org/D97977	2021-03-15 13:05:39 -05:00
Roman Lebedev	78b8ce40ef	Reland [SCEV] Improve modelling for (null) pointer constants This reverts commit `329aeb5db4`, and relands commit `61f006ac65`. This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-13 16:05:34 +03:00
Roman Lebedev	329aeb5db4	Temporarily revert "[SCEV] Improve modelling for (null) pointer constants" This appears to have broken ubsan bot: https://lab.llvm.org/buildbot/#/builders/85/builds/3062 https://reviews.llvm.org/D98147#2623549 It looks like LSR needs some kind of a change around insertion point handling. Reverting until i have a fix. This reverts commit `61f006ac65`.	2021-03-13 09:10:28 +03:00
Roman Lebedev	61f006ac65	[SCEV] Improve modelling for (null) pointer constants This is a continuation of D89456. As it was suggested there, now that SCEV models `PtrToInt`, we can try to improve SCEV's pointer handling. In particular, i believe, i will need this in the future to further fix `SCEVAddExpr` operation type handling. This removes special handling of `ConstantPointerNull` from `ScalarEvolution::createSCEV()`, and adds constant folding into `ScalarEvolution::getPtrToIntExpr()`. This way, `null` constants stay as such in SCEV's, but gracefully become zero integers when asked. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D98147	2021-03-12 22:11:58 +03:00
Roman Lebedev	f449e5ef9b	[NFCI] Fix polly tests after `b46c085d2b` That commit changed SCEVExpander to emit intrinsics instead of icmp+select, but i forgot about polly, and i'm not sure if any bots complained.	2021-03-07 20:44:04 +03:00
Michael Kruse	b85c98b4c5	[Polly][Codegen] Emit access group metadata. Emit llvm.loop.parallel_accesses metadata instead of llvm.mem.parallel_loop_access. The latter is deprecated because it assumes that LoopIDs are persistent, which they are not. We also emit parallel access metadata for all surrounding parallel loops, not just the innermost parallel.	2021-03-04 03:58:03 -06:00
Michael Kruse	91c472c86c	[Polly] Fix test after D96534.	2021-02-19 12:49:29 -06:00
Michael Kruse	089421ba9a	[Polly] Test all optimization levels.	2021-02-14 00:31:10 -06:00
Michael Kruse	95ef556bd1	[Polly] Preserve DetectionContext references. DetectionContext objects are stored as values in a DenseMap. When the DenseMap reaches its maximum load factor, it is resized and all its objects moved to a new memory allocation. Unfortunately, Scop objects have a reference to their DetectionContext. When the DenseMap resizes, all the DetectionContext references now point to invalid memory, even if caused by an unrelated DetectionContext. Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the Verify=true parameter before each pass. This caused the old DetectionContext to be removed and a new one created and re-verified. Of course, the Scop object was already created pointing to the old DetectionContext. Because the new DetectionContext would usually be stored at the same position in the DenseMap, the reference would usually reference the new DetectionContext of the same Region. Usually. If not, the old position still points to memory in the DenseMap allocation (unless also a resizing occurs) such that tools like Valgrind and AddressSanitizer would not be able to diagnose this. Instead of storing the DetectionContext inside the DenseMap, use a std::unique_ptr to a DetectionContext allocation, i.e. it will not move around anymore. This also allows us to remove the very strange DetectionContext(const DetectionContext &&) copy/move(?) constructor. DetectionContext objects now are neither copied nor moved. As a result, every re-verification of a DetectionContext will use a new allocation. Therefore, once a Scop object has been created using a DetectionContext, it must not be re-verified (the Scop data structure requires its underlying Region to not change before code generation anyway). The NewPM may call isMaxRegionInScop only with Validate=false parameter.	2021-02-13 03:36:09 -06:00
Michael Kruse	d50f92a4f0	[Polly] Added dedicated test for working -O3 pipeline. Test the NewPM as well as the legacy PM.	2021-02-10 13:25:56 -06:00
Michael Kruse	11511ee343	[Polly] Do not use -O3 pipeline for single pass test.	2021-02-10 13:25:56 -06:00
Michael Kruse	e200df952b	[Polly] Port IslScheduleOptimizer to the NewPM.	2021-02-09 23:56:21 -06:00
Michael Kruse	b687fc9122	[Polly] Port PruneUnprofitable to the NewPM.	2021-02-09 23:56:20 -06:00
Michael Kruse	7903d594ea	[Polly] Port DeLICM to the NewPM.	2021-02-09 23:56:19 -06:00
Michael Kruse	4c64d8ee3a	[Polly] Port ForwardOpTree to the NewPM.	2021-02-09 23:56:19 -06:00
Michael Kruse	3dcb535115	[Polly] Remove use of -O3 in regression test. In addition to the fact that regression tests should not test the entire pass pipeline (unless they are testing the pipeline itself), the Polly-ACC currently does not support the new pass manager. If enabled by default, such tests will therefore fail. Use the -polly-gpu-runtime and -polly-gpu-arch options also as default values for the PPCGCodeGeneration pass. This requires the option to be moved from the pipeline-building Register passes to the PPCGCodeGeneration implementation. Fixes the spir-typesize.ll buildbot fail.	2021-02-09 18:13:35 -06:00
Arthur Eubanks	781a1b1e36	[test] Pin spir-codegen.ll to legacy PM -polly-enable-delicm is not supported under the new PM but is tested here: Assertion `!EnableDeLICM && "This option is not implemented"' failed.	2021-02-03 19:37:32 -08:00
Michael Kruse	3b9677e1ec	[Polly] Track defined behavior for PHI predecessor computation. ZoneAlgorithms's computePHI relies on being provided with a consistent schedule to compute the statement predecessors of a statement containing PHINodes. Otherwise unexpected results such as PHI nodes with multiple predecessors can occur which would result in problems in the algorithms expecting consistent data. In the added test case, statement instances are scrubbed from the SCoP when their execution would result in undefined behavior (due to a nsw overflow). As already being undefined behavior in LLVM-IR, neither AssumedContext nor InvalidContext are updated, giving computePHI no means to avoid these cases. Introduce a new SCoP property, the DefinedBehaviorContext, that among the runtime-checked conditions, also tracks the assumptions not needing a runtime check, in particular those affecting the assumed control flow. This replaces the manual combination of the 3 other contexts that was already done in computePHI and setNewAccessRelation. Currently, the only additional assumption is that loop induction variables with the nsw flag will not wrap, but potentially more can be added. Use in hasFeasibleRuntimeContext, isl::ast_build and gisting are other potential uses. To limit computational complexity, the DefinedBehaviorContext is not available if it grows too large (atm hardcoded to 8 disjuncts). Possible other fixes include bailing out in computePHI when inconsistencies are detected, choosing an arbitrary value for inconsistent cases (since it is undefined behavior anyways), or making the code receiving the result from ComputePHI handle inconsistent data. All of them reduce the quality of implementation, having to bail out more often and disabling the ability to assert on actually wrong results. This fixes llvm.org/PR48783.	2021-01-23 13:03:49 -06:00
Michael Kruse	a5b895110f	[Polly] Gist new access relations using the SCoP context. This simplifies the access relations.	2021-01-23 13:03:48 -06:00
Arthur Eubanks	cabe1b1124	[polly][NewPM][test] Fix polly tests under -enable-new-pm In preparation for turning on opt's -enable-new-pm by default, this pins uses of passes via the legacy "opt -passname" with pass names beginning with "polly-" and "polyhedral-info" to the legacy PM. Many of these tests use -analyze, which isn't supported in the new PM. (This doesn't affect uses of "opt -passes=passname"). rL240766 accidentally removed `-polly-prepare` in phi_not_grouped_at_top.ll, and it also doesn't use the output of -analyze. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D94266	2021-01-19 12:38:58 -08:00
Michael Kruse	842314b5f0	[Polly] Update isl to isl-0.23-61-g24e8cd12. This fixes llvm.org/PR48554 Some test cases had to be updated because the hash function for union_maps have been changed which affects the output order.	2021-01-19 12:01:31 -06:00
Juneyoung Lee	278aa65cc4	[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D93793	2020-12-30 04:21:04 +09:00
Michael Kruse	bc633fe46b	[Polly] Consider InvalidContext to determine partial READ. MemoryAccess::setNewAccessRelation() in assert-builds checks whether the access relation for a READ has a memory location for every instance of the domain. Otherwise, we would not have a value to load from. That check already considered that instances outside the Scop's context do not matter since they are never executed (or would be undefined behavior). In this patch also take instances of the InvalidContext into account, as these can also be assumed to never occur. InvalidContext was introduced to avoid the computational complexity of subtracting restrictions from the AssumedContext. However, this additional check in setNewAccessRelation is only done in assert-builds. The assertion case with an InvalidContext may occur with DeLICM on conditionally infinite loops, as it is the case in the following code: for (int i = 0; i < n; i+=b) vreg = ...; Dest = vreg; The loop is infinite when b=0, and [b] -> { : b = 0 } is part of the InvalidContext. When DeLICM tries to map the memory for %vreg to Dest, there is no store instance that uses the value of vreg when b = 0, hence no location to map it to. However, the case is irrelevant since Polly's runtime condition check ensures that this is never the case. Fixes llvm.org/PR48445	2020-12-10 22:25:19 -06:00
Michael Kruse	6249bfeefe	[Polly][CodeGen] Remove use of ScalarEvolution. ScalarEvolution::getSCEV cannot be used during codegen. ScalarEvolution assumes a stable IR and control flow which is under construction during Polly's CodeGen. In particular, it uses DominatorTree to compute the backedge taken count. However the DominatorTree is not updated during codegen. In this case, SCEV was used to determine the base pointer of an array access. Replace it by our own function. Polly generates only GEP and BitCasts for array accesses, i.e. it is sufficient to handle these two to find the base pointer. Fixes llvm.org/PR48422	2020-12-07 15:21:51 -06:00
Michael Kruse	c8a0e27cfb	[Polly][OpTree] Fix mid-processing change of access kind. Operand tree forwarding can cause the change of an access kind; in particular change from a scalar kind to an array kind if the scalar dependency is not necessary. Such an access cannot and doesn't need to be forwarded anymore. Fixes llvm.org/PR48034	2020-11-11 16:21:48 -06:00
Michael Kruse	c1cf51e777	[Polly][OpTree] Better report applied changes. Print to dbgs() any taken action. Also, read-only scalars do not require any action unless -polly-analyze-read-only-scalars=true is used. Better reflect this by using ForwardingAction::triviallyForwardable and thus not bumping the statistics.	2020-11-11 16:21:48 -06:00
Michael Kruse	e408935bb5	[Polly][ScopBuilder] Use only modeled instructions to compute statement granularity. ScopBuilder distributes independent instructions between statements. Only modeled (e.g. not synthesizable) instructions are represented. To compute independence, non-modeled instructions were used in some parts of determining instruction independence, which could lead to the re-introduction of non-modeled instructions. In particular, required invariant loads could be added to the instruction list, which then led to redundant MemoryAccesses for such a load. This fixes llvm.org/PR48059.	2020-11-10 15:30:16 -06:00
Roman Lebedev	b4916918e5	[SCEV] SCEVPtrToIntExpr simplifications If we've got an SCEVPtrToIntExpr(op), where op is not an SCEVUnknown, we want to sink the SCEVPtrToIntExpr into an operand, so that the operation is performed on integers, and eventually we end up with just an `SCEVPtrToIntExpr(SCEVUnknown)`. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89692	2020-10-30 11:13:35 +03:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 \| PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But i think that it isn't the best solution, because it doesn't really matter from memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00
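
The ordering fix described in commit f51427afb5 ([Polly][Unroll] Fix unroll_double test) can be illustrated with a small sketch. It is not Polly's implementation, which operates on isl sets and maps in C++. The function name `group_by_scatter`, the tuple encoding of scatter values, and the string encoding of statement instances are all assumptions made for illustration. The point is that only the distinct scatter values are sorted, and every statement instance sharing a scatter value ends up in one group, so no order between them is implied by the sort.

```python
from collections import defaultdict


def group_by_scatter(pairs):
    """Group statement instances by scatter value (hypothetical helper).

    `pairs` is an iterable of (scatter_value, statement_instance) tuples,
    standing in for the enumerated Domain x Scatter relation.
    """
    groups = defaultdict(list)
    for scatter, instance in pairs:
        groups[scatter].append(instance)
    # Sort only by the scatter value. Instances inside one group are sorted
    # by name purely to make this sketch's output deterministic; that inner
    # order is an assumption of the sketch, not something the commit defines.
    return [(scatter, sorted(groups[scatter])) for scatter in sorted(groups)]


if __name__ == "__main__":
    enumerated = [
        ((1,), "S2[1]"),
        ((0,), "S1[0]"),
        ((1,), "S1[1]"),
        ((0,), "S2[0]"),
    ]
    for scatter, instances in group_by_scatter(enumerated):
        print(scatter, instances)
```

Sorting the flat list of pairs by scatter value alone would leave the relative order of `S1[1]` and `S2[1]` dependent on enumeration order, which is the kind of undefined ordering the commit message describes.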


1469 Commits