llvm-project

Commit Graph

Author	SHA1	Message	Date
Reid Kleckner	46ef2e0bf9	Update polly for removal of CallInst::arg_operands/getNumArgOperands Fixes polly build	2021-10-08 10:46:05 -07:00
Michael Kruse	cb879d00d8	[Polly] Completely remove -polly-opt-fusion. This was missing from `07e7cb9433`. The switch did nothing since then.	2021-10-08 02:10:34 -05:00
Simon Pilgrim	f1be391bed	[polly] Replace report_fatal_error(std::string) uses with report_fatal_error(Twine) As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.	2021-10-06 13:32:57 +01:00
Christopher Tetreault	67acc772d0	[NFC] Fix build failure in ScopDetection In some build environments, the C++ compiler is unable to infer the correct type for the DenseMap::insert in isErrorBlock. Typing out std::make_pair helps.	2021-10-04 09:19:27 -07:00
Philip Reames	d02db32644	[SCEV] Use full logic when inferring flags on add and gep This is a follow-on to D109845. With that landed, we will have fixed all known instances of pr51817, and can thus start inferring flags more aggressively with greatly reduced risk of miscompiles. This patch simply applies the same inference logic used in that patch to our other major flag inference path. We can still do much better here (on both paths), but this is our first step. Differential Revision: https://reviews.llvm.org/D111003	2021-10-03 15:32:15 -07:00
Philip Reames	2ca8a3f213	[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes This fixes a violation of the wrap flag rules introduced in `c4048d8f`. This was also noted in the (very old) PR23527. The issue being fixed is that we assume the inbound flag on any GEP implies that all users of any gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true. In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when it's legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics. The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow-up ideas mentioned in PR51817. It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before nor after exploits that. Differential Revision: https://reviews.llvm.org/D109789	2021-10-01 16:30:44 -07:00
Roman Gareev	113fa82c3c	[Polly] Check the properties of accesses to operands of a matrix-matrix multiplication The following code modifies elements of the array D. for (i = 0; i < _PB_NI; i++) for (j = 0; j < _PB_NJ; j++) { for (k = 0; k < _PB_NK; k++) { double Mul = A[i][k] * B[k][j]; D[i][j][k] += Mul; C[i][j] += Mul; } } Nevertheless, the code is recognised as a matrix-matrix multiplication, since the second and third dimensions of D are accessed with non-zero strides. This fixes the typo, which was made during the translation to C++ bindings (https://reviews.llvm.org/D35845). Reviewed By: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D110491	2021-09-28 22:58:57 +05:00
Michael Kruse	027c036663	[Polly] Reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964 Recommit with "REQUIRES: asserts" in test that uses statistics.	2021-09-27 18:49:11 -05:00
Haowei Wu	283ed7de32	Revert "[Polly] Reject reject regions entered by an indirectbr/callbr." This reverts commit `91f46bb77e` which causes test failures when assertions are off.	2021-09-27 16:05:33 -07:00
Michael Kruse	91f46bb77e	[Polly] Reject reject regions entered by an indirectbr/callbr. SplitBlockPredecessors is unable to insert an additional BasicBlock between an indirectbr/callbr terminator and the successor blocks. This is needed by Polly to normalize the control flow before emitting its optimized code. This patch rejects regions entered by an indirectbr/callbr to not fail later at code generation. This fixes llvm.org/PR51964	2021-09-26 21:21:50 -05:00
Michael Kruse	9820dd970c	[Polly] Support for InlineAsm. Inline assembly was not handled at all and treated like a llvm::Value. In particular, it tried to create a pointer to it, which is not allowed. Fix by handling it like a llvm::Constant such that it is just reused when required, instead of trying to marshal it in memory. Fixes llvm.org/PR51960	2021-09-26 03:26:43 -05:00
Michael Kruse	d5c87162db	[Polly] Use VirtualUse to determine references. VirtualUse ensures consistency over different sources of values within Polly. In particular, this enables its use of instructions moved between statements. Before the patch, the code wrongly assumed that the BB's instructions are also the ScopStmt's instructions. References are determined for OpenMP outlining and GPGPU kernel extraction. GPGPU CodeGen had some problems. For one, it generated GPU kernel parameters for constants. Second, it emitted GPU-side invariant loads which have already been loaded by the host. This has been partially fixed: it still generates a store for the invariant load result, but using the value that the host has already written. WARNING: I did not test the generated PollyACC code on an actual GPU. The improved consistency will be made use of in the next patch.	2021-09-26 03:26:43 -05:00
Michael Kruse	1cea25eec9	[Polly] Remove isConstCall. The function was intended to catch OpenMP functions such as get_thread_id(). If matched, the call would be considered synthesizable. There were a few problems with this: * get_thread_id() is not 'const' in the sense of how the GCC manual defines it: "do not examine any values except their arguments". get_thread_id() reads OpenCL runtime library global state. What was intended was probably 'speculable'. * isConstCall was implemented using mayReadOrWriteMemory(). 'const' is stricter than that, mayReadOrWriteMemory is e.g. true for malloc(), since it may only read/write addresses that are considered inaccessible from the application. However, malloc is certainly not speculable. * Values that are isConstCall were not handled consistently throughout Polly. In particular, it was not considered for referenced values (OpenMP outlining and PollyACC). Fix by removing special handling for isConstCall entirely.	2021-09-26 03:26:43 -05:00
Michael Kruse	a5d47b3fa0	[Polly] Fix wrong redirect in test case.	2021-09-24 14:53:00 -05:00
Michael Kruse	e470f9268a	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass. Re-apply after revert in `c7bcd72a38` with fix: Take isBand out of #ifndef NDEBUG since it now is used unconditionally.	2021-09-23 21:11:01 -05:00
Petr Hosek	c7bcd72a38	Revert "[Polly] Implement user-directed loop distribution/fission." This reverts commit `52c30adc7d` which breaks the build when NDEBUG is defined.	2021-09-23 14:04:25 -07:00
Michael Kruse	07e7cb9433	[Polly] Remove -polly-opt-fusion option. The name of the option is misleading and has been renamed by isl to "serialize-sccs". Instead of also renaming the option, remove it. The option is still accessible using -polly-isl-arg=--no-schedule-serialize-sccs	2021-09-23 15:43:08 -05:00
Michael Kruse	35f7020098	[Polly] Dissolve Isl test directory. NFC. All tests use ISL, integrate its subfolder into the components they belong to.	2021-09-22 17:45:07 -05:00
Michael Kruse	52c30adc7d	[Polly] Implement user-directed loop distribution/fission. This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct. It interprets the same metadata as the LoopDistribute pass.	2021-09-22 17:28:25 -05:00
Michael Kruse	ced20c6672	[Polly] Add -polly-reschedule and -polly-postopts options. These command line options allow turning off parts of the schedule tree optimization pipeline.	2021-09-22 00:18:19 -05:00
Michael Kruse	cad9f98a2a	[Polly] Don't generate inter-iteration noalias metadata. This metadata was intended to mark all accesses within an iteration to be pairwise non-aliasing, in this case because every memory of a base pointer is touched (read or write) at most once. This is typical for 'sweeps' over all data. The stated motivation from D30606 is to ensure that unrolled iterations are considered non-aliasing. The implementation had multiple issues: * The structure of the noalias metadata was malformed. D110026 added a check in the verifier for this metadata, and the tests were failing since then. * This is not true for the outer loops of the BLIS matrix multiplication, where it was being inserted. Each element of A, B, C is accessed multiple times, as often as the loop not used as an index is iterating. * Scopes were added to SecondLevelOtherAliasScopeList (used for the !noalias scope list) on-the-fly when another SCEV was seen. This meant that previously visited instructions would not be updated with alias scopes that are only seen later, missing out those SCEVs they should not be aliasing with. * Since the !noalias scope list would ideally consist of all other SCEVs for this base pointer, we might quickly run into scalability issues. Especially after unrolling there would probably be at least one SCEV per instruction and unroll instance. * The inter-iteration noalias base pointer was not removed after leaving the loop marked with it, effectively marking everything after it as noalias as well. A solution I considered was to mark each instruction as non-aliasing with its own scope. The instruction itself would obviously alias itself, but such a construction might also be considered invalid. Duplicating the instruction (e.g. due to speculation) would mark the instruction non-aliasing with its clone. I don't want to go into this territory, especially since the original motivation of determining unrolled instances as noalias based on SCEV is what scev-aa does as well. 
This effectively reverts D30606 and D35761.	2021-09-20 22:20:17 -05:00
Nikita Popov	53720f74e4	[Polly] Partially fix scoped alias metadata This partially addresses the verifier failures caused by D110026. In particular, it does not fix the "second level" alias metadata.	2021-09-20 22:51:31 +02:00
Nikita Popov	0fc624f029	[IR] Return AAMDNodes from Instruction::getMetadata() (NFC) getMetadata() currently uses a weird API where it populates a structure passed to it, and optionally merges into it. Instead, we can return the AAMDNodes and provide a separate merge() API. This makes usages more compact. Differential Revision: https://reviews.llvm.org/D109852	2021-09-16 21:06:57 +02:00
Michael Kruse	658eb9e142	[Polly] Remove autotools build systems from Externals. NFC. Building a source distribution using autotools adds GPL-licenced files into the sources. Although redistribution of these files is explicitly allowed with an exception, these are not used by Polly which uses a CMake replacement. Use the direct source checkout instead (replacing the output of 'make dist'). Some m4 scripts with the same licence are also included in isl/ppcg repository. Removing them renders the autotools-based build scripts inoperable, so remove the autotools build system altogether.	2021-09-15 17:11:15 -05:00
Leonard Chan	9da62d3ed9	[polly] Fix "no member named 'getIndexExpressionsFromGEP'" As of 741fabc222f226d34d806056b804244b012853b, polly builders are failing from this error. The signature is slightly different and accepts a ScalarEvolution reference instead. This should fix the polly builders.	2021-09-08 20:04:56 -07:00
Michael Kruse	8ae6933881	[Polly] Compile fix after Delinearization move. by commit `585c594d74`	2021-09-08 15:30:19 -05:00
Michael Kruse	c62d9a5ca0	[Polly] Use subtyped isl::schedule_nodes for ScheduleTreeVisitor. NFC. Change pass-by-const-ref to pass-by-value as objects are recreated due to custom up-/down-casting anyway.	2021-08-31 20:54:12 -05:00
Michael Kruse	c6913905d1	[Polly] Mention correct flag in debug output. NFCI.	2021-08-31 20:54:12 -05:00
Michael Kruse	ca5f05d2df	[Polly][test] Add dependency to count. Polly does not use the count program itself, but somewhere in lit it is expected to exist. Otherwise, the following error occurs: llvm-lit: llvm-project/llvm/utils/lit/lit/llvm/subst.py:133: fatal: Did not find count in ./bin	2021-08-28 22:50:07 -05:00
Michael Kruse	ffa39b4582	[Polly] Fix dumpfunction.ll test.	2021-08-28 22:43:07 -05:00
Michael Kruse	e4f3f2c0c5	[Polly] Don't prune non-external function itself from dump.	2021-08-28 17:06:53 -05:00
Sylvestre Ledru	c22bd391bc	polly: remove the old reference to svn in the doc	2021-08-27 10:46:50 +02:00
Michael Kruse	1537563104	[Polly][test] Add missing %loadPolly. This fixes check-polly when using the -load mechanism, i.e. LLVM_POLLY_LINK_INTO_TOOLS=OFF.	2021-08-24 13:47:25 -05:00
Michael Kruse	cdbc86dd22	[Polly] Don't redundantly link libPolly into unittests. With LLVM_LINK_LLVM_DYLIB and LLVM_POLLY_LINK_INTO_TOOLS, Polly is already linked into libLLVM.so, linking libPolly.a as well into unittests results in duplicate command line registration errors.	2021-08-24 03:07:30 -05:00
Michael Kruse	955b91c19c	[Polly] Never consider non-SCoP blocks as error blocks. Code outside the SCoP will be executed regardless of the code versioning runtime check introduced by CodeGeneration. Assumptions based on the premise that these blocks are never executed in Polly-optimized code do not hold. This fixes the miscompilation of MultiSource/Applications/lambda-0.1.3	2021-08-23 01:04:01 -05:00
Michael Kruse	9cfab5e249	[Polly] Add support for -polly-dump-before/after with NPM. The new pass manager does not allow adding module passes at the -polly-position=before-vectorizer extension point. Introduce a DumpFunctionPass that dumps only the current function. In contrast to the legacy pass manager's -polly-dump-before, each function will be dumped into its own file. -polly-dump-before-file is still not supported. The DumpFunctionPass uses llvm::CloneModule to copy the current function into a new module and then writes it into a file.	2021-08-22 20:43:35 -05:00
Michael Kruse	58e4e71fc8	[Polly] Introduce caching for the isErrorBlock function. NFC. Compilation of the file insn-attrtab.c of the SPEC CPU 2017 502.gcc_r benchmark takes excessive time (> 30min) with Polly enabled. Most time is spent in the isErrorBlock function querying the DominatorTree. The isErrorBlock is invoked redundantly over the course of ScopDetection and ScopBuilder. This patch introduces a caching mechanism for its result. Instead of a free function, isErrorBlock is moved to ScopDetection where its cache map resides. This also means that many functions directly or indirectly calling isErrorBlock are not "const" anymore. The DetectionContextMap was marked as "mutable", but IMHO it never should have been since it stores the detection result. 502.gcc_r only takes excessive time with the new pass manager. The reason seems to be that it invalidates the ScopDetection analysis more often than the legacy pass manager, for unknown reasons.	2021-08-18 14:05:50 -05:00
Michael Kruse	e8c8407aca	[Polly] Break early when the result is known. NFC.	2021-08-18 12:41:04 -05:00
Michael Kruse	0f1e67fac2	[Polly] Fix possibly infinite loop. The loop had no side-effect since first committed in `642594ae87`. While it is obvious what was intended, the code seems to never trigger.	2021-08-17 10:43:04 -05:00
Riccardo Mori	ce8272afb3	[Polly][Isl] Use isl::val::sub instead of isl::val::sub_ui. NFC This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noexceptions.h` and the official isl C++ interface. Changes made: - Use `isl::val::sub` instead of `isl::val::sub_ui` - `isl-noexceptions.h` has been generated by `355e84163a` Depends on D107225 Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D107293	2021-08-17 09:34:52 +02:00
Riccardo Mori	d3fdbda6b0	[Polly][Isl] Move to the new-polly-generator branch version of isl-noexceptions.h. NFCI This is part of an effort to reduce the differences between the custom C++ bindings used right now by polly in `lib/External/isl/include/isl/isl-noexceptions.h` and the official isl C++ interface. With this commit we are moving from the `polly-generator` branch to the `new-polly-generator` branch that is more maintainable and is based on the official C++ interface `cpp-checked.h`. Changes made: - There are now many subclasses for `isl::ast_node` representing different isl types. Use `isl::ast_node_for`, `isl::ast_node_user`, `isl::ast_node_block` and `isl::ast_node_mark` where needed. - There are now many subclasses for `isl::schedule_node` representing different isl types. Use `isl::schedule_node_mark`, `isl::schedule_node_extension`, `isl::schedule_node_band` and `isl::schedule_node_filter` where needed. - Replace the `isl::*::dump` with `dumpIslObj` since the isl dump method is not exposed in the C++ interface. - `isl::schedule_node::get_child` has been renamed to `isl::schedule_node::child` - `isl::pw_multi_aff::get_pw_aff` has been renamed to `isl::pw_multi_aff::at` - The constructor `isl::union_map(isl::union_pw_multi_aff)` has been replaced with the static method `isl::union_map::from()` - Replace usages of `isl::val::add_ui` with `isl::val::add` - `isl::union_set_list::alloc` is now a constructor - All the `isl_size` values are now wrapped inside the class `isl::size`; use `isl::size::release` to get the internal `isl_size` value where needed. - `isl-noexceptions.h` has been generated by `73f5ed1f4d` No functional change intended. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D107225	2021-08-16 15:53:26 +02:00
Michael Kruse	5eeaac22af	[Polly] Rename CodeGen -> generateCode. NFC. To conform to function naming convention: camelCase and start with a verb.	2021-08-13 12:46:07 -05:00
Michael Kruse	0232c1d10d	[Polly] Decompose object construction and detection algorithm. NFC. Avoid doing the detection work inside the constructor. In addition to polymorphism being unintuitive in constructors and other design problems such as if an exception is thrown, the ScopDetection class is usable without detection in the sense of "no Scop found" or "function skipped".	2021-08-13 12:44:37 -05:00
Michael Kruse	5a6d770651	[Polly] Fix compiler warnings. NFC.	2021-08-12 13:35:20 -05:00
Michael Kruse	9069082785	[Polly] Simplify domains before gist. The compilation of the file 526.blender_r/src/blender/source/blender/editors/space_logic/logic_ops.c from the SPEC CPU 2017 benchmarks took excessive time to compute InvalidDomain.gist_params(Ctx). Simplifying beforehand, specifically using isl_set_detect_equalities, reduces the computation time to a negligible level again.	2021-08-12 08:48:14 -05:00
Michael Kruse	0f50ffb336	[Polly][test] Add tests for IslMaxOperationsGuard. Add unittests for IslMaxOperationsGuard and the behaviour of the isl-noexceptions.h wrapper under exceeded max_operations. Reviewed By: patacca Differential Revision: https://reviews.llvm.org/D107401	2021-08-05 14:52:39 -05:00
Michael Kruse	50eaa82cdb	[Polly][test] Test difference between isl::stat::ok() and isl::stat::error(). The foreach callback wrapper tests check the return values of isl::stat::ok() and isl::stat::error() separately. However, because the container they iterate over contains just one element, they do not actually test the difference between them. This patch changes the set to be iterated over to contain 2 elements, so that returning isl::stat::ok (continue iterating with the next element) and isl::stat::error (break after the current element) have different effects other than the return value of the foreach itself. Reviewed By: patacca Differential Revision: https://reviews.llvm.org/D107395	2021-08-05 14:49:31 -05:00
Eli Friedman	3f2828dc28	[polly] Fix up regression test config with current features. Primarily, configure substitutions so we can copy-paste the "RUN" line of failed tests without worrying about the paths.	2021-07-30 13:44:48 -07:00
Tarindu Jayatilaka	7a797b2902	Take OptimizationLevel class out of Pass Builder Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D107025	2021-07-29 21:57:23 -07:00
Tom Stellard	08c766a731	Bump the trunk major version to 14 and clear the release notes.	2021-07-27 21:58:25 -07:00
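
Illustration for `113fa82c3c`: the loop nest quoted in that commit message, written out as a small self-contained C++ sketch. It only shows why the kernel is not a pure matrix-matrix multiplication: C receives the product, but every iteration also writes a distinct element of D. The sizes `NI`, `NJ`, `NK`, the type aliases, and the helpers `runExample` and `countNonZero` are hypothetical names introduced for this illustration; none of this is Polly code.

```cpp
#include <array>
#include <cstddef>

// Hypothetical, tiny problem sizes chosen only for this illustration.
constexpr std::size_t NI = 2, NJ = 2, NK = 3;

using MatA = std::array<std::array<double, NK>, NI>;
using MatB = std::array<std::array<double, NJ>, NK>;
using MatC = std::array<std::array<double, NJ>, NI>;
using Cube = std::array<std::array<std::array<double, NK>, NJ>, NI>;

// Loop nest from the commit message: besides accumulating into C[i][j],
// each iteration also updates D[i][j][k], an extra side effect that a
// plain matrix-matrix multiplication does not have.
void kernel(const MatA &A, const MatB &B, MatC &C, Cube &D) {
  for (std::size_t i = 0; i < NI; i++)
    for (std::size_t j = 0; j < NJ; j++)
      for (std::size_t k = 0; k < NK; k++) {
        double Mul = A[i][k] * B[k][j];
        D[i][j][k] += Mul;
        C[i][j] += Mul;
      }
}

// Counts the elements of D that are non-zero.
std::size_t countNonZero(const Cube &D) {
  std::size_t Count = 0;
  for (const auto &Plane : D)
    for (const auto &Row : Plane)
      for (double V : Row)
        if (V != 0.0)
          ++Count;
  return Count;
}

struct Result {
  MatC C;
  Cube D;
};

// Runs the kernel on A filled with 1.0 and B filled with 2.0,
// starting from zero-initialised C and D.
Result runExample() {
  MatA A{};
  MatB B{};
  for (auto &Row : A)
    Row.fill(1.0);
  for (auto &Row : B)
    Row.fill(2.0);
  Result R{};
  kernel(A, B, R.C, R.D);
  return R;
}
```

With these inputs every C[i][j] is the sum of NK products, and every element of D ends up modified. Treating the nest as a matrix multiplication that only produces C would therefore lose the writes to D.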

4233 Commits