llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	b5563c6817	Make sure that all parameter dimensions are set in schedule Summary: In case the option -polly-ignore-parameter-bounds is set, not all parameters will be added to context and domains. This is useful to keep the size of the sets and maps we work with small. Unfortunately, for AST generation it is necessary to ensure all parameters are part of the schedule tree. Hence, we modify the GPGPU code generation to make sure this is the case. To obtain the necessary information we expose a new function Scop::getFullParamSpace(). We also make a couple of functions const to be able to make SCoP::getFullParamSpace() const. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36243 llvm-svn: 309939	2017-08-03 13:51:15 +00:00
Tobias Grosser	7b45af13ce	Move setNewAccessRelation to isl++ llvm-svn: 309871	2017-08-02 19:27:25 +00:00
Tobias Grosser	6d58804cc2	Move ScopStmt::setAccessRelation to isl++ llvm-svn: 309870	2017-08-02 19:27:16 +00:00
Philip Pfaffe	33aef072c1	Fix r309826: Appease clang-format check. llvm-svn: 309853	2017-08-02 18:26:48 +00:00
Philip Pfaffe	8f1872fb27	Fix r309826: Move intantiation and specialization of OwningScopAnalysisManagerFunctionProxy to the polly namespace. When compiling with clang, explicit instantiation of the OwningScopAnalysisManagerFunctionProxy needs to happen within the polly namespace. Same goes with the specialization of its run method. llvm-svn: 309835	2017-08-02 17:25:45 +00:00
Philip Pfaffe	a70e2649ab	[Polly][PM][WIP] Polly pass registration Summary: This patch is a first attempt at registering Polly passes with the LLVM tools. Tool plugins are still unsupported, but this registration is usable from the tools if Polly is linked into them (albeit requiring minimal patches to those tools). Registration requires a small amount of machinery (the owning analysis proxies), necessary for injecting ScopAnalysisManager objects into the calling tools. This patch is marked WIP because the registration is incomplete. Parsing manual pipelines is fully supported, but default pass injection into the O3 pipeline is lacking, mostly because there is opportunity for some redesign here, I believe. The first point of order would be insertion points. I think it makes sense to run before the vectorizers. Running Polly Early, however, is weird. Mostly because it actually is the default (which to me is unexpected), and because Polly runs it's own O1 pipeline. Why not instead insert it at an appropriate place somewhere after simplification happend? Running after the loop optimizers seems intuitive, but it also seems wasteful, since multiple consecutive loops might well be a single scop, and we don't need to run for all of them. My second request for comments would be regarding all those smallish helper passes we have, like PollyViewer, PollyPrinter, PollyImportJScop. Right now these are controlled by command line options, deciding whether they should be part of the Polly pipeline. What is your opinion on treating them like real passes, and have the user write an appropriate pipeline if they want to use any of them? Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35458 llvm-svn: 309826	2017-08-02 15:52:25 +00:00
Philip Pfaffe	f081ec7609	[PM] Fix proxy invalidation Summary: I made a mistake in handling transitive invalidation of analysis results. I've updated the list of preserved analyses as well as the correct result dependences. The Invalidator passed through the invalidate() path can be used to transitively invalidate analyses. It frequently happens that analysis results depend on other analyses, and thus store references to their results. When the dependee now gets invalidated, the depender needs to be invalidated as well. This is the purpose of the Invalidator object, which can be used to check whether some dependee analysis is in the process of being invalidated. I originally was checking the wrong dependee analyses, which is an actual error, you can only check analysis results that are in the cache (which they are if you've captured their reference). The invalidation I'm handling inside the proxy deals with the standard analyses the proxy passes into the Scop pipeline, since I'm capturing their reference. This checking allows us to actually preserve a couple of results outside of the proxy, since the Scop pipeline shouldn't break those, or otherwise should update them accordingly. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36216 llvm-svn: 309811	2017-08-02 13:18:49 +00:00
Philip Pfaffe	ead67dbbd6	[SI][NewPM] Collect loop count statistics llvm-svn: 309807	2017-08-02 11:14:41 +00:00
Philip Pfaffe	f5a4394ad6	[SD] Set PollyUseRuntimeAliasChecks correctly llvm-svn: 309805	2017-08-02 11:08:01 +00:00
Siddharth Bhat	0a1177b58e	[ScopDetect] add `-polly-ignore-func` flag to ignore functions by name. Ignore all functions whose name match a regex. Useful because creating a regex that does not match a string is somewhat hard. Example: https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex llvm-svn: 309377	2017-07-28 11:47:24 +00:00
Michael Kruse	a508a4e619	[ScopBuilder/Simplify] Refactor isEscaping. NFC. ScopBuilder and Simplify (through VirtualInstruction.cpp) previously used this functionality in their own implementation. Refactor them both into a common one into the Scop class. BlockGenerator also makes use of a similiar functionality, but also records outside users and takes place after region simplification. Merging it as well would be more complicated. llvm-svn: 309273	2017-07-27 14:39:52 +00:00
Michael Kruse	eca86cee64	[ScopInfo] Never print instruction list of region stmts. A region statement's instruction list is always empty and ignored by the code generator. Don't give the impression that it means anything. llvm-svn: 309197	2017-07-26 22:01:33 +00:00
Michael Kruse	1df1aac014	[ScopInfo] Avoid use of getStmtFor(BB). NFC. Since there will be no more a 1:1 correspondence between statements and basic blocks, we would like to get rid of the method getStmtFor(BB) and its uses. Here we remove one of its uses in ScopInfo by fetching the statement in which the call instruction lies. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35691 llvm-svn: 309110	2017-07-26 13:25:28 +00:00
Michael Kruse	8d89179e33	[ScopInfo] Rename ScopStmt::contains(BB) to represents(BB). NFC. In future, there will be no more a 1:1 correspondence between statements and basic blocks, the name `contains` does not correctly capture their relationship. A BB may infact comprise of multiple statements; hence we describe a statement 'representing' a basic block. Differential Revision: https://reviews.llvm.org/D35838 llvm-svn: 308982	2017-07-25 16:25:37 +00:00
Tobias Grosser	d7065e5df5	Move MemoryAccess::isStride* to isl++ llvm-svn: 308927	2017-07-24 20:50:22 +00:00
Tobias Grosser	b739cb42f5	Move MemoryAccess::InvalidDomain to isl++ llvm-svn: 308923	2017-07-24 20:30:34 +00:00
Tobias Grosser	cdf471baef	Move MemoryAccess::getPwAff to isl++ llvm-svn: 308895	2017-07-24 16:36:34 +00:00
Tobias Grosser	1f6ba7e238	Move MemoryAccess::MemoryAccess to isl++ llvm-svn: 308893	2017-07-24 16:22:32 +00:00
Tobias Grosser	206e9e3b3b	Move ScopArrayInfo::getFromAccessFunction and getFromId to isl++ llvm-svn: 308892	2017-07-24 16:22:27 +00:00
Michael Kruse	07e8c36dc7	[ForwardOpTree] Support read-only value uses. Read-only values (values defined before the SCoP) require special handing with -polly-analyze-read-only-scalars=true (which is the default). If active, each use of a value requires a read access. When a copied value uses a read-only value, we must also ensure that such a MemoryAccess is available or is created. Differential Revision: https://reviews.llvm.org/D35764 llvm-svn: 308876	2017-07-24 12:43:27 +00:00
Siddharth Bhat	e2699b572e	[Polly] [NFC] [ScopDetection] Make `polly-only-func` perform regex scop name match. Summary: - We were using `.count` in `StringRef`, which matches substrings. - We may want to use this for equality as well. - Generalise this, so allow regexes as a parameter to `polly-only-func`. Differential Revision: https://reviews.llvm.org/D35728 llvm-svn: 308875	2017-07-24 12:40:52 +00:00
Tobias Grosser	1959dbda75	Move MemoryAccess::get*ArrayId to isl++ llvm-svn: 308843	2017-07-23 04:08:59 +00:00
Tobias Grosser	3b196131b5	Move applyScheduleToAccessRelation to isl++ llvm-svn: 308842	2017-07-23 04:08:52 +00:00
Tobias Grosser	6a87036e0f	Move MemoryAccess::getAddressFunction to isl++ llvm-svn: 308841	2017-07-23 04:08:45 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Tobias Grosser	22da5f087a	Move MemoryAccess::getOriginalAccessRelation to isl++ llvm-svn: 308839	2017-07-23 04:08:27 +00:00
Tobias Grosser	0c4c2eef75	Move MemoryAccess::AccessRelation to isl++ llvm-svn: 308838	2017-07-23 04:08:22 +00:00
Tobias Grosser	b6e7a85a6d	Move MemoryAccess::createBasicAccessMap to isl++ llvm-svn: 308837	2017-07-23 04:08:17 +00:00
Tobias Grosser	fe46c3ff3a	Move MemoryAccess::id to isl++ llvm-svn: 308836	2017-07-23 04:08:11 +00:00
Michael Kruse	e52ebd1ae4	[ScopInfo] Adapt indentation of instruction list printing. Change the indention of the last brace to align with the opening line. Before: Instructions { %val = fadd double %arg, 2.100000e+01 store double %val, double* %A } After: Instructions { %val = fadd double %arg, 2.100000e+01 store double %val, double* %A } llvm-svn: 308828	2017-07-22 16:44:39 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Michael Kruse	e186013149	Annotate dump() functions with LLVM_DUMP_METHOD. NFC. llvm-svn: 308749	2017-07-21 15:54:13 +00:00
Michael Kruse	5d5184698d	[ScopInfo] Don't compile dump() functions into non-assert builds. NFC. This follows a convention used in LLVM. llvm-svn: 308748	2017-07-21 15:54:07 +00:00
Michael Kruse	cd4c977b8b	[ScopInfo] Print instructions in dump(). Print a statement's instruction on dump() regardless of -polly-print-instructions. dump() is supposed to be used in the debugger only and never in regression tests. While debugging, get all the information we have and we are not bound to break anything. For non-dump purposes of print, forward the setting of -polly-print-instructions as parameters. Some calls to print() had to be changed because the PollyPrintInstructions setting is only available in ScopInfo.cpp. In ScheduleOptimizer.cpp, dump() was used in regression tests. That's not what dump() is for. The print parameter "PrintInstructions" will also be useful for an explicit print SCoP pass in a future patch. llvm-svn: 308746	2017-07-21 15:35:53 +00:00
Tobias Grosser	1eeedf4829	[IslNodeBuilder] Relax complexity check in invariant loads and run it early When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659	2017-07-20 19:55:19 +00:00
Michael Kruse	1ce6791e7e	[ScopInfo] Get a list of statements for a region node. NFC. When constructing a schedule true and there are multiple statements for a basic block, create a sequence node for these statements. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35679 llvm-svn: 308635	2017-07-20 17:18:58 +00:00
Michael Kruse	6eba4b1031	[ScopInfo] Remove dependency of Scop::getLastStmtFor(BB) on getStmtFor(BB). NFC. We are working towards removing uses of Scop::getStmtFor(BB). In this patch, we remove dependency of Scop::getLastStmtFor(BB) on getStmtFor(BB). To do so, we get the list of all statements corresponding to the BB and then fetch the last one. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35665 llvm-svn: 308633	2017-07-20 17:08:50 +00:00
Michael Kruse	3562f272cf	[ScopInfo] Use map for lookupPHIReadOf. NFC. Introduce previously missing PHIReads analogous the the already existing PHIWrites/ValueWrites/ValueReads maps. PHIReads was initially not required and the later introduced lookupPHIReadOf() used a linear search instead. With PHIReads, lookupPHIReadOf() can now also do a map lookup and remove any surprising performance/behaviour differences to lookupPHIWriteOf(), lookupValueWriteOf() and lookupValueReadOf(). llvm-svn: 308630	2017-07-20 16:47:57 +00:00
Michael Kruse	22058c3fbb	[Simplify] Remove unused instructions and accesses. Use a mark-and-sweep algorithm to find and remove unused instructions and MemoryAccesses. This is useful in particular to remove scalar writes that are never used anywhere. A scalar write in a loop induces a write-after-write dependency that stops the loop iterations to be rescheduled. Such writes can be a result of previous transformations such as DeLICM and operand tree forwarding. It adds a new class VirtualInstruction that represents an instruction in a particular statement. At the moment an instruction can only belong to the statement that represents a BasicBlock. In the future, instructions can be in one of multiple statements representing a BasicBlock (Nandini's work), in different statements than its BasicBlock would indicate, and even multiple statements at once (by forwarding operand trees). It also integrates nicely with the VirtualUse class. ScopStmt::contains(Instruction*) currently uses the instruction's parent BasicBlock to check whether it contains the instruction. It will need to check the actual statement list when one of the aforementioned features become possible. Differential Revision: https://reviews.llvm.org/D35656 llvm-svn: 308626	2017-07-20 16:21:55 +00:00
Michael Kruse	4642c3ce85	[ScopBuilder] Avoid use of getStmtFor(BB). NFC. Since there will be no more a 1:1 correspondence between statements and basic blocks, we would like to get rid of the method getStmtFor(BB) and its uses. Here we remove one of its uses in ScopBuilder by fetching the statement in which the instruction lies. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35610 llvm-svn: 308610	2017-07-20 12:47:09 +00:00
Michael Kruse	0865585eab	[ScopInfo] Add support for wrap-around of integers in unsigned comparisons. This is one possible solution to implement wrap-arounds for integers in unsigned icmp operations. For example, store i32 -1, i32* %A_addr %0 = load i32, i32* %A_addr %1 = icmp ult i32 %0, 0 %1 should hold false, because under the assumption of unsigned integers, -1 should wrap around to 2^32-1. However, previously. it was assumed that the MSB (Most Significant Bit - aka the Sign bit) was never set for integers in unsigned operations. This patch modifies the buildConditionSets function in ScopInfo.cpp to give better information about the integers in these unsigned comparisons. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35464 llvm-svn: 308608	2017-07-20 12:37:02 +00:00
Michael Kruse	8b8058072f	[ScopInfo] Integrate ScalarDefUseChain into polly::Scop. NFC. Before this patch, ScalarDefUseChain was a tool used by DeLICM to find all reads and writes of scalar accesses. It iterated once over all accesses and stores the accesses into maps. By integrating it into the Scop class, we can keep the maps up-to-date without the need for recomputing them. It will be needed for more than DeLICM in the future, such as SCoP simplification, code movement between virtual statements, and array expansion (GSoC project). Compared to ScalarUseDefChain, we save two maps by finding the ScopStmt a Def/PHIRead must reside in, and use its already existing lookup function to find the MemoryAccess. Differential Revision: https://reviews.llvm.org/D35631 llvm-svn: 308495	2017-07-19 17:11:25 +00:00
Tobias Grosser	199ec4af40	[ScopInfo] Do not create entries in map if non exists Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 308491	2017-07-19 16:31:10 +00:00
Michael Kruse	4dfa732750	[ScopInfo] Introduce list of statements in Scop::StmtMap. NFC. Once statements are split, a BasicBlock will comprise of multiple statements. To prepare for this change in future, we introduce a list of statements in the statement map. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35301 llvm-svn: 308318	2017-07-18 15:41:49 +00:00
Eli Friedman	e737fc120e	[Polly] [OptDiag] Updating Polly Diagnostics Remarks Utilizing newer LLVM diagnostic remark API in order to enable use of opt-viewer tool. Polly Diagnostic Remarks also now appear in YAML remark file. In this patch, I've added the OptimizationRemarkEmitter into certain classes where remarks are being emitted and update the remark emit calls itself. I also provide each remark a BasicBlock or Instruction from where it is being called, in order to compute the hotness of the remark. Patch by Tarun Rajendran! Differential Revision: https://reviews.llvm.org/D35399 llvm-svn: 308233	2017-07-17 23:58:33 +00:00
Tobias Grosser	66e38a84be	[Polly] Avoid use of `getStmtFor(BB)` in PolyhedralInfo. NFC Summary: Since there will be no more a 1-1 correspondence between statements and basic block, we would like to get rid of the method `getStmtFor(BB)` and its uses. Here we remove one of its uses in PolyhedralInfo, as suggested by Michael Sir. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35300 llvm-svn: 308220	2017-07-17 20:58:13 +00:00
Tobias Grosser	4556c9b8fe	[ScopInfo] Simplify new access functions under domain context Summary: We do not keep domain constraints on access functions when building the scop. Hence, for consistency reasons, it makes also sense to not include them when storing a new access function. This change results in simpler access functions that make output easier to read. This patch also helps to make DeLICMed memory accesses to be understood by our matrix multiplication pattern matching pass. Further changes to the matrix multiplication pattern matching are needed for this to work, so the corresponding test case will be added in a future commit. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35237 llvm-svn: 308215	2017-07-17 20:47:10 +00:00
Tobias Grosser	21cbcf03d3	ScopInfo: Remove not-in-DomainMap statements in separate function This separates ScopBuilder internal and ScopBuilder external functionality. llvm-svn: 308152	2017-07-16 23:55:38 +00:00
Tobias Grosser	3012a0b302	Fix typo in comment [NFC] llvm-svn: 308149	2017-07-16 22:44:17 +00:00
Tobias Grosser	a3aa423fc3	[ScopDetection] If a loop is not part of a scop, none of it backedges can be This patch makes sure that in case a loop is not fully contained within a region that later forms a SCoP, none of the loop backedges are allowed to be part of the region. We currently do not support the situation where only some of a loops backedges are part of a scop. Today, this can break both scop modeling and code generation. One such breaking test case is for example test/ScopDetectionDiagnostics/loop_partially_in_scop-2.ll, where we totally forgot to code generate some of the backedges. Fortunately, it is commonly not necessary to support these partial loops, it is way more common that either no backedge is included in a region or all loop backedge are included. This fixes a recent miscompile in MultiSource/Benchmarks/MiBench/consumer-typeset which was exposed after r306477. llvm-svn: 308113	2017-07-15 22:42:17 +00:00
Tobias Grosser	325204a30e	[Polly] Translate Scop::DomainMap to islpp Reviewers: grosser, Meinersbur, bollu Subscribers: pollydev Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35453 llvm-svn: 308093	2017-07-15 12:41:32 +00:00
Tobias Grosser	13acbb91ee	[Polly] Use Isl c++ for InvalidDomainMap Reviewers: grosser, Meinersbur, bollu Subscribers: maxf, pollydev Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35308 llvm-svn: 308089	2017-07-15 09:01:31 +00:00
Siddharth Bhat	a1b2086a33	[Invariant Loads] Do not consider invariant loads to have dependences. We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907	2017-07-13 12:18:56 +00:00
Singapuram Sanjay Srivallabh	1abd9ffa37	[PPCGCodeGen] Differentiate kernels based on their parent Scop Summary: Add a sequence number that identifies a ptx_kernel's parent Scop within a function to it's name to differentiate it from other kernels produced from the same function, yet different Scops. Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well. Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ \| kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: mehdi_amini, nemanjai, pollydev, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35176 llvm-svn: 307814	2017-07-12 16:46:19 +00:00
Tobias Grosser	6a4c12fb33	Always export the latest memory access relations This allows us to export the results from transformations such as DeLICM. llvm-svn: 307641	2017-07-11 10:10:13 +00:00
Tobias Grosser	f44f005a7d	Remove freed InvalidDomains from InvalidDomainMap. Summary: Since r306667, propagateInvalidStmtDomains gets a reference to an InvalidDomainMap. As part of the branch leading to return false, the respective domain is freed. It is, however, not removed from the InvalidDomainMap, leaking a pointer to a freed object which results in a use-after-free. Fix this be removing the domain from the map before returning. We tried to derive a test case that reliably failes, but did not succeed in producing one. Hence, for now the failures in our LNT bots must be sufficient to keep this issue tested. Reviewers: grosser, Meinersbur, bollu Subscribers: bollu, nandini12396, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D34971 llvm-svn: 307499	2017-07-09 15:47:17 +00:00
Siddharth Bhat	47c7237bd8	[NFC] [ScopInfo] fix warning about construction order llvm-svn: 307164	2017-07-05 15:07:28 +00:00
Singapuram Sanjay Srivallabh	02ca346e48	Introduce a hybrid target to generate code for either the GPU or CPU Summary: Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU. When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration. In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: kbarton, nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D34054 llvm-svn: 306863	2017-06-30 19:42:21 +00:00
Tobias Grosser	37c8ee7611	Fix typo llvm-svn: 306791	2017-06-30 06:30:51 +00:00
Michael Kruse	476f855ec8	[ScopInfo] Do not use ScopStmt in Domain derivation of ScopInfo. NFC ScopStmts were being used in the computation of the Domain of the SCoPs in ScopInfo. Once statements are split, there will not be a 1-to-1 correspondence between Stmts and Basic blocks. Thus this patch avoids the use of getStmtFor() by creating a map of BB to InvalidDomain and using it to compute the domain of the statements. Contributed-by: Nanidini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D33942 llvm-svn: 306667	2017-06-29 12:47:41 +00:00
Michael Kruse	b738ffa845	Heap allocation for new arrays. This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add : - In the class ScopArrayInfo, we add a boolean as member called IsOnHeap which represents the fact that the array in allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes this two blocks and add the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified : - createScopArrayInfo and getOrCreateScopArrayInfo such that it returns a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionnaly such that it return both start block and end block of the scop, because we need this two blocs to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540	2017-06-28 13:02:43 +00:00
Tobias Grosser	1b9d1bcc6d	[ScopInfo] Bound the number of array disjuncts in run-time bounds checks This reduces the compilation time of one reduced test case from Android from 16 seconds to 100 mseconds (we bail out), without negatively impacting any other test case we currently have. We still saw occasionally compilation timeouts on the AOSP buildbot. Hopefully, those will go away with this change. llvm-svn: 306235	2017-06-25 06:32:00 +00:00
Michael Kruse	7604d9add5	[ScopBuilder] Pass ScopStmts around instead of BasicBlocks. NFC. During the construction of MemoryAccesses in ScopBuilder, BasicBlocks were used in function parameters, assuming that the ScopStmt an be directly derived from it. This won't be true anymore once we split BasicBlocks into multiple ScopStmt. As a preparation for such a change in the future, we instead pass the ScopStmt and avoid the use of getStmtFor(). There are two occasions where a kind of mapping from BasicBlock to ScopStmt is still required. 1. Get the statement representing the incoming block of a `PHINode` using `getLastStmtOf`. 2. One statement is required to write a scalar to be readable by those which need it. This is most often the statement which contains its definition, which we get using `getStmtFor(Instruction*)`. Differential Revision: https://reviews.llvm.org/D34369 llvm-svn: 306132	2017-06-23 17:55:36 +00:00
Tobias Grosser	78a7a6cddf	Bail out early in case we see an invalid runtime context in buildAliasGroups llvm-svn: 306088	2017-06-23 08:05:31 +00:00
Tobias Grosser	57a1d36d98	Hoist buildMinMaxAccess computeout to cover full alias-group This allows us to bail out both in case the lexmin/max computation is too expensive, but also in case the commulative cost across an alias group is too expensive. This is an improvement of r303404, which did not seem to be sufficient to keep the Android Buildbot quiet. llvm-svn: 306087	2017-06-23 08:05:27 +00:00
Tobias Grosser	8f23fb8486	[islpp] Move buildMinMaxAccess[es] to C++ [NFC] llvm-svn: 306086	2017-06-23 08:05:20 +00:00
Eli Friedman	5e589ea4b1	[ScopInfo] Fix crash with sum of invariant load and AddRec. r303971 added an assertion that SCEV addition involving an AddRec and a SCEVUnknown must involve a dominance relation: either the SCEVUnknown value dominates the AddRec's loop, or the AddRec's loop header dominates the SCEVUnknown. This is generally fine for most usage of SCEV because it isn't possible to write an expression in IR which would violate it, but it's a bit inconvenient here for polly. To solve the issue, just avoid creating a SCEV expression which triggers the asssertion. I'm not really happy with this solution, but I don't have any better ideas. Fixes https://bugs.llvm.org/show_bug.cgi?id=33464. Differential Revision: https://reviews.llvm.org/D34259 llvm-svn: 305864	2017-06-20 22:53:02 +00:00
Reid Kleckner	df2b283bf9	Fix -Wsign-compare in ScopInfo.cpp llvm::Loop::getNumBlocks returns an unsigned int, not a long. llvm-svn: 305717	2017-06-19 17:44:02 +00:00
Siddharth Bhat	286c916dde	[Polly] [ScopDetection] Allow passing multiple functions to `-polly-only-func`. - This is useful to run optimisations on only certain functions. Differential Revision: https://reviews.llvm.org/D33990 llvm-svn: 305060	2017-06-09 08:23:40 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Tobias Grosser	4071cb571a	[ScopInfo] Translate getNonHoistableCtx to C++ [NFC] llvm-svn: 304841	2017-06-06 23:13:02 +00:00
Siddharth Bhat	07bee290de	[CodeGen] Extend Performance Counter to track per-scop information. Previously, we would generate one performance counter for all scops. Now, we generate both the old information, as well as a per-scop performance counter to generate finer grained information. This patch needed a way to generate a unique name for a `Scop`. The start region, end region, and function name combined provides a unique `Scop` name. So, `Scop` has a new public API to provide its start and end region names. Differential Revision: https://reviews.llvm.org/D33723 llvm-svn: 304528	2017-06-02 08:01:22 +00:00
Michael Kruse	678aa336fa	[ScopBuilder] Exclude ignored intrinsics from explicit instruction list. Ignored intrinsics are ignored at code generation, therefore do not need to be part of the instruction list. Specifically, llvm.lifetime.* intrinisics are removed before code generation, referencing them would cause a use-after-free error. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D33768 llvm-svn: 304483	2017-06-01 21:46:27 +00:00
Eli Friedman	de1b318dad	Add opt-bisect support to polly. This is useful for debugging miscompiles and extracting testcases for crashes. See http://llvm.org/docs/OptBisect.html . Differential Revision: https://reviews.llvm.org/D33752 llvm-svn: 304480	2017-06-01 21:29:05 +00:00
Tobias Grosser	dff902fca7	[ScopInfo] Do not lookup key twice [NFC] Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 304410	2017-06-01 12:46:51 +00:00
Michael Kruse	ed0c2f7e90	[ScopInfo] Do not add terminator & synthesizable instructions to the output instructions. Such instructions are generates on-demand by the CodeGenerator and thus do not need representation in a statement. Differential Revision: https://reviews.llvm.org/D33642 llvm-svn: 304151	2017-05-29 12:27:38 +00:00
Siddharth Bhat	8bb436eb26	Revert "[NFC] Fix formatting & typecast issue. Build succeeds." Should not have 'fixed' the formatting issue, I did not have the most recent version of `clang-format`. This reverts commit 761b1268359e14e59142f253d77864a29d55c56c. llvm-svn: 304148	2017-05-29 11:34:29 +00:00
Siddharth Bhat	ede801ca2b	[NFC] Fix formatting & typecast issue. Build succeeds. - Fix formatting in `RegisterPasses.cpp`. - `assert` tried to compare `isl::boolean` against `long`. Explicitly construct `bool` from `isl::boolean`. This allows the implicit cast of `bool` to `long. llvm-svn: 304146	2017-05-29 11:00:31 +00:00
Tobias Grosser	1e55db30d5	Delinearize memory accesses that reference parameters coming from function calls Certain affine memory accesses which we model today might contain products of parameters which we might combined into a new parameter to be able to create an affine expression that represents these memory accesses. Especially in the context of OpenCL, this approach looses information as memory accesses such as A[get_global_id(0) * N + get_global_id(1)] are assumed to be linear. We correctly recover their multi-dimensional structure by assuming that parameters that are the result of a function call at IR level likely are not parameters, but indeed induction variables. The resulting access is now A[get_global_id(0)][get_global_id(1)] for an array A[][N]. llvm-svn: 304075	2017-05-27 15:18:53 +00:00
Tobias Grosser	f5e7e60bc8	Allow side-effect free function calls in valid affine SCEVs Side-effect free function calls with only constant parameters can be easily re-generated and consequently do not prevent us from modeling a SCEV. This change allows array subscripts to reference function calls such as 'get_global_id()' as used in OpenCL. We use the function name plus the constant operands to name the parameter. This is possible as the function name is required and is not dropped in release builds the same way names of llvm::Values are dropped. We also provide more readable names for common OpenCL functions, to make it easy to understand the polyhedral model we generate. llvm-svn: 304074	2017-05-27 15:18:46 +00:00
Tobias Grosser	d5fcbef8ee	[Polly] Added the list of Instructions to output in ScopInfo pass Summary: This patch outputs all the list of instructions in BlockStmts. Reviewers: Meinersbur, grosser, bollu Subscribers: bollu, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D33163 llvm-svn: 304062	2017-05-27 04:40:18 +00:00
Tobias Grosser	9932086895	[ScopInfo] Translate mapToDimension to isl C++ [NFC] llvm-svn: 304007	2017-05-26 17:22:03 +00:00
Tobias Grosser	c8d13f50cc	[ScopInfo] Tighten compute out introduced in r303404 It seems we are still spending too much time on rare inputs, which continue to timeout the AOSP buildbot. Let's see if a further reduction is sufficient. llvm-svn: 303807	2017-05-24 21:24:04 +00:00
Philip Pfaffe	1a0128faaa	[Polly] Add handling of Top Level Regions Summary: My goal is to make the newly added `AllowWholeFunctions` options more usable/powerful. The changes to ScopBuilder.cpp are exclusively checks to prevent `Region.getExit()` from being dereferenced, since Top Level Regions (TLRs) don't have an exit block. In ScopDetection's `isValidCFG`, I removed a check that disallowed ReturnInstructions to have return values. This might of course have been intentional, so I would welcome your feedback on this and maybe a small explanation why return values are forbidden. Maybe it can be done but needs more changes elsewhere? The remaining changes in ScopDetection are simply to consider the AllowWholeFunctions option in more places, i.e. allow TLRs when it is set and once again avoid derefererncing `getExit()` if it doesn't exist. Finally, in ScopHelper.cpp I extended `polly::isErrorBlock` to handle regions without exit blocks as well: The original check was if a given BasicBlock dominates all predecessors of the exit block. Therefore I do the same for TLRs by regarding all BasicBlocks terminating with a ReturnInst as predecessors of a "virtual" function exit block. Patch by: Lukas Boehm Reviewers: philip.pfaffe, grosser, Meinersbur Reviewed By: grosser Subscribers: pollydev, llvm-commits, bollu Tags: #polly Differential Revision: https://reviews.llvm.org/D33411 llvm-svn: 303790	2017-05-24 18:39:39 +00:00
Philip Pfaffe	24a1bb2cf9	Post-commit fix of a comment llvm-svn: 303628	2017-05-23 11:25:05 +00:00
Philip Pfaffe	3ef36fa222	[Polly][NewPM] Port DependenceInfo to the new ScopPassManager. Summary: This patch ports DependenceInfo to the new ScopPassManager. Printing is implemented as a seperate printer pass. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33421 llvm-svn: 303621	2017-05-23 10:09:06 +00:00
Tobias Grosser	a32de1341e	[ScopInfo] Translate foldAccessRelation to isl C++ [NFC] llvm-svn: 303615	2017-05-23 07:22:56 +00:00
Tobias Grosser	53fc355e7d	[ScopInfo] Translate buildMemIntrinsicAccessRelation to isl C++ [NFC] llvm-svn: 303612	2017-05-23 07:07:09 +00:00
Tobias Grosser	1e2edaf3ea	[ScopInfo] Translate assumeNoOutOfBound to isl C++ [NFC] llvm-svn: 303611	2017-05-23 07:07:07 +00:00
Tobias Grosser	b1ed3d9749	[ScopInfo] Translate applyAndSetFAD to isl C++ llvm-svn: 303610	2017-05-23 07:07:05 +00:00
Tobias Grosser	2ade986c9e	[ScopInfo] Translate isReadOnly to isl C++ llvm-svn: 303608	2017-05-23 06:41:04 +00:00
Tobias Grosser	6d459c5d3d	[ScopInfo] Simplify domains early This speeds up scop modeling for scops with many redundent existentially quantified constraints. For the attached test case, this change reduces scop modeling time from minutes (hours?) to 0.15 seconds. This change resolves a compilation timeout on the AOSP build. Thanks Eli for reporting _and_ reducing the test case! Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 303600	2017-05-23 04:26:28 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then execute the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Tobias Grosser	7be8245a40	[ScopInfo] Translate updateDimensionality to isl C++ [NFC] llvm-svn: 303514	2017-05-21 20:38:33 +00:00
Tobias Grosser	3137f2cb65	[ScopInfo] Translate wrapConstantDimensions to isl C++ [NFC] llvm-svn: 303511	2017-05-21 20:23:23 +00:00
Tobias Grosser	99ea1d0808	[ScopInfo] Translate addRangeBoundsToSet to isl C++ [NFC] llvm-svn: 303510	2017-05-21 20:23:20 +00:00
Siddharth Bhat	b7f68b8c9e	[Fortran Support] Materialize outermost dimension for Fortran array. - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429	2017-05-19 15:07:45 +00:00
Tobias Grosser	d8945baa0a	[ScopDetection] Allow detection of full functions This is useful when only analyzing functions. llvm-svn: 303420	2017-05-19 12:13:02 +00:00
Tobias Grosser	977158488e	[ScopInfo] Fix typo in documentation llvm-svn: 303405	2017-05-19 04:01:52 +00:00
Tobias Grosser	45e9fd1810	[ScopInfo] Gracefully handle long compile times The following test case tried to compute the lexicographic minimum of the following set during alias analysis, which caused very long compile time: [p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] : (517p_3 >= 70944 - 298p_2 and 256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or (409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <= -57120 + 298p_2 + 409p_4) or (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >= 17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or (104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or (409p_4 <= 57119 - 298p_2 and 256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or (517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <= -70689 + 298p_2 + 517p_3) or (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and 256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5 >and 256i0 <= -57120 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 149p_5 <= -17329 >+ 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= >17455 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 + >50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328 >- 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and >256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or >(p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 + >298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) } We now guard the potentially expensive functions in Polly's scop analysis to gracefully bail out in case of overly long compilation times. llvm-svn: 303404	2017-05-19 03:45:00 +00:00
Michael Kruse	960c0d0b04	[ScopInfo] Fix r302231 to use logical or (\|\|). NFC. In r302231 we mistakenly use bitwise or (\|) instead of logical or (\|\|). This patch fixes that. Contributed-by: Sameer AbuAsal <sabuasal@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33337 llvm-svn: 303386	2017-05-18 21:55:36 +00:00
Siddharth Bhat	06e3c74d83	[Fortran Support] Change "global" pattern match to work for params Summary: - Rename global / local naming convention that did not make much sense to Visible / Invisible, where the visible refers to whether the ALLOCATE call to the Fortran array is present in the current module or not. - This match now works on both cross fortran module globals and on parameters to functions since neither of them are necessarily allocated at the point of their usage. - Add testcase that matches against both a load and a store against function parameters. Differential Revision: https://reviews.llvm.org/D33190 llvm-svn: 303356	2017-05-18 16:47:13 +00:00
Philip Pfaffe	35bdcaf9e9	[Polly][NewPM][WIP] Add a ScopPassManager This patch adds both a ScopAnalysisManager and a ScopPassManager. The ScopAnalysisManager is itself a Function-Analysis, and manages analyses on Scops. The ScopPassManager takes care of building Scop pass pipelines. This patch is marked WIP because I've left two FIXMEs which I need to think about some more. Both of these deal with invalidation: Deferred invalidation is currently not implemented. Deferred invalidation deals with analyses which cache references to other analysis results. If these results are invalidated, invalidation needs to be propagated into the caching analyses. The ScopPassManager as implemented assumes that ScopPasses do not affect other Scops in any way. There has been some discussion about this on other patch threads, however it makes sense to reiterate this for this specific patch. I'm uploading this patch even though it's incomplete to encourage discussion and give you an impression of how this is going to work. Differential Revision: https://reviews.llvm.org/D33192 llvm-svn: 303062	2017-05-15 13:43:01 +00:00
Philip Pfaffe	838e0884ef	[Polly][NewPM] Port ScopInfo to the new PassManager llvm-svn: 303056	2017-05-15 12:55:14 +00:00
Siddharth Bhat	0fe7231a2f	[Fortran Support] Add pattern match for Fortran Arrays that are parameters. - This breaks the previous assumption that Fortran Arrays are `GlobalValue`. - The names of functions were getting unwieldy. So, I renamed the Fortran related functions. Differential Revision: https://reviews.llvm.org/D33075 llvm-svn: 303040	2017-05-15 08:41:30 +00:00
Philip Pfaffe	5cc87e3ab3	[Polly][NewPM] Port ScopDetection to the new PassManager Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: pollydev, sanjoy, nemanjai, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31459 llvm-svn: 302902	2017-05-12 14:37:29 +00:00
Michael Kruse	4c27643398	[DeLICM] Lookup input accesses. Previous to this patch, we used VirtualUse to determine the input access of an llvm::Value in a statement. The input access is the READ MemoryAccess that makes a value available in that statement, which can either be a READ of a MemoryKind::Value or the MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input access to heuristically find a candidate to map without searching all possible values. This might modify the behaviour in that previously PHI accesses were not considered input accesses before. This was unintentially lost when "VirtualUse" was extracted from the "Known Knowledge" patch. llvm-svn: 302838	2017-05-11 22:56:46 +00:00
Michael Kruse	e60eca7316	[ScopInfo] Keep scalar acceess dictionaries up-to-data. NFC. When removing a MemoryAccess, also remove it from maps pointing to it. This was already done for InstructionToAccess, but not yet for ValueReads, ValueWrites and PHIWrites as those were only used during the ScopBuilder phase. Keeping them updated allows us to use them later as well. llvm-svn: 302836	2017-05-11 22:56:12 +00:00
Siddharth Bhat	c47f039efd	[Fix] [Fortran Support] Fix variable name & make testcase activate on release There was: #ifdef NDEBUG This should be: #ifndef NDEBUG Also, the variable name was incorrect. Fixed the variable name. llvm-svn: 302696	2017-05-10 17:27:48 +00:00
Siddharth Bhat	f2dbba8183	[Fortran Support] Detect Fortran arrays & metadata from dragonegg output Add the ability to tag certain memory accesses as those belonging to Fortran arrays. We do this by pattern matching against known patterns of Dragonegg's LLVM IR output from Fortran code. Fortran arrays have metadata stored with them in a struct. This struct is called the "Fortran array descriptor", and a reference to this is stored in each MemoryAccess. Differential Revision: https://reviews.llvm.org/D32639 llvm-svn: 302653	2017-05-10 13:11:20 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Michael Kruse	5ae08c0ebb	[DeLICM] Known knowledge. Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict the known content is the same. The content information if computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339	2017-05-06 14:03:58 +00:00
Michael Kruse	391a2ac09b	[ScopBuilder] Move Scop::init to ScopBuilder. NFC. Scop::init is used only during SCoP construction. Therefore ScopBuilder seems the more appropriate place for it. We integrate it onto its only caller ScopBuilder::buildScop where some other construction steps already took place. Differential Revision: https://reviews.llvm.org/D32908 llvm-svn: 302276	2017-05-05 20:09:08 +00:00
Michael Kruse	f1052ceb5e	[ScopBuilder] Do not verify unfeasible SCoPs. SCoPs with unfeasible runtime context are thrown away and therefore do not need their uses verified. The added test case requires a complexity limit to exceed. Normally, error statements are removed from the SCoP and for that reason are skipped during the verification. If there is a unfeasible runtime context (here: because of the complexity limit being reached), the removal of error statements and other SCoP construction steps are skipped to not waste time. Error statements are not modeled in SCoPs and therefore have no requirements on whether the scalars used in them are available. llvm-svn: 302234	2017-05-05 13:38:35 +00:00
Tobias Grosser	d5727c5011	Fix handling of signWrappedSets in access relations Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip manually bounding the access relation in case the parameter of the load instruction is already a wrapped set. Later on we assume that the lower bound on the set is always smaller or equal to the upper bound on the set. Bug 32715 manages to construct a sign wrapped set, in which case the assertion does not necessarily hold. Fix this by handling a sign wrapped set similar to a normal wrapped set, that is skipping the computation. Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Reviewers: grosser Subscribers: pollydev, llvm-commits Tags: #Polly Differential Revision: https://reviews.llvm.org/D32893 llvm-svn: 302231	2017-05-05 13:20:47 +00:00
Michael Kruse	704c03e03b	[ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH. It was forgotten in r302157. llvm-svn: 302163	2017-05-04 15:55:54 +00:00
Michael Kruse	eedae7630a	Introduce VirtualUse. NFC. If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn to problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157	2017-05-04 15:22:57 +00:00
Tobias Grosser	3f25a7e8ee	[ScopDetection] Check for already known required-invariant loads [NFC] For certain test cases we spent over 50% of the scop detection time in checking if a load is likely invariant. We can avoid most of these checks by testing early on if a load is expected to be invariant. Doing this reduces scop-detection time on a large benchmark from 52 seconds to just 25 seconds. No functional change is expected. llvm-svn: 302134	2017-05-04 10:16:20 +00:00
Tobias Grosser	e2ccc3fb33	[ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters LLVM-IR names are commonly available in debug builds, but often not in release builds. Hence, using LLVM-IR names to identify statements or memory reference results makes the behavior of Polly depend on the compile mode. This is undesirable. Hence, we now just number the statements instead of using LLVM-IR names to identify them (this issue has previously been brought up by Zino Benaissa). However, as LLVM-IR names help in making test cases more readable, we add an option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by default set in the polly tests to make test cases more readable. This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the following test case provided by Eli Friedman <efriedma@codeaurora.org> (already used in one of the previous commits): struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } For a larger benchmark I have on-hand (10000 loops), this reduces the time for running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%. The reason for this large speedup is that our previous use of printAsOperand had a quadratic cost, as for each printed and unnamed operand the full function was scanned to find the instruction number that identifies the operand. We do not need to adjust the way memory reference ids are constructured, as they do not use LLVM values. Reviewed by: efriedma Tags: #polly Differential Revision: https://reviews.llvm.org/D32789 llvm-svn: 302072	2017-05-03 20:08:52 +00:00
Tobias Grosser	72684bbaf5	[ScopInfo] Remove code not needed anymore after r302004 llvm-svn: 302005	2017-05-03 08:02:32 +00:00
Tobias Grosser	8133128c17	[ScopInfo] Do not add array name into memory reference ids Before this change a memory reference identifier had the form: <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11 After this change, we use the format: <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0 The name of the array that is accessed through a memory reference is not necessary to uniquely identify a memory reference, but was only added to provide additional information for debugging. We drop this information now for the following two reasons: 1) This shortens the names and consequently improves readability 2) This removes a second location where we decide on the name of a scop array, leaving us only with the location where the actual scop array is created. Having after 2) only a single location to name scop arrays will allow us to change the naming convention of scop arrays more easily, which we will do in a future commit to reduce compilation time. llvm-svn: 302004	2017-05-03 07:57:35 +00:00
Tobias Grosser	c96c1d8c87	[ScopInfo] Consider only write-free dereferencable loads as invariant When we introduced in r297375 support for hoisting loads that are known to be dereferencable without any conditional guard, we forgot to keep the check to verify that no other write into the very same location exists. This change ensures now that dereferencable loads are allowed to access everything, but can only be hoisted in case no conflicting write exists. This resolves llvm.org/PR32778 Reported-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 301582	2017-04-27 20:08:16 +00:00
Siddharth Bhat	729377f063	[Polly] [DependenceInfo] change WAR generation, Read will not block Read Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute Read. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266	2017-04-24 22:23:12 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Siddharth Bhat	bcbfdade41	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treates any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustaWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-souce. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block ealier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. = Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 \| -> A = add (store0) \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = tmp + 42 \| A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = A + 42 \| A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429	2017-04-04 13:08:23 +00:00
Michael Kruse	6e7854a560	[ScopInfo] Fix typos in option description. llvm-svn: 299356	2017-04-03 12:03:38 +00:00
Huihui Zhang	d6d6a3f2ee	revert test commit r299024 llvm-svn: 299026	2017-03-29 20:23:56 +00:00
Huihui Zhang	9d19e9d232	test commit, add blank line llvm-svn: 299024	2017-03-29 20:10:45 +00:00
Siddharth Bhat	44b6cb4e63	[DependenceInfo] change name Write to MustWrite to remove ambiguity [NFC] "Write" is an overloaded term. In collectInfo() till buildFlow(), it is used to mean "must writes". However, within the memory based analysis, it is used to mean "both may and must writes". Renaming the Write variable helps clarify this difference. Reviewers: grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D31181 llvm-svn: 298361	2017-03-21 11:54:08 +00:00
Michael Kruse	89b1f94e64	Revert "Remove references to AssumptionCache. NFC." The AssumptionCache removal of r289756 has been reverted in r290086/r290087. A different solution has been implemented in r291671 which keeps the AssumptionCache. We can therefore use it again in Polly. This reverts r289791. llvm-svn: 298089	2017-03-17 13:56:53 +00:00
Siddharth Bhat	4fe11cf95f	[DependenceInfo] Remove idempotent union: must-writes with may-writes [NFC] Since may-writes are always a superset of the must-writes, there is no point in taking a union of one with the other. llvm-svn: 298085	2017-03-17 13:26:10 +00:00
Michael Kruse	9b91c62e3a	[ScopInfo/PruneUnprofitable] Move default profitability check. In the previous default ScopInfo applied the profitability heuristic for scalar accesses (-polly-unprofitable-scalar-accs=true) and the -polly-prune-unprofitable was disabled by default (-polly-enable-prune-unprofitable=false) as that pruning was already done. This changes switches the defaults to -polly-unprofitable-scalar-accs=true -polly-enable-prune-unprofitable=false such that the scalar access heuristic check is done by the pass. This allows passes between ScopInfo and PruneUnprofitable to optimize away scalar accesses. Without enabling such intermediate passes, there is no change in behaviour of profitability checks in a PassManagerBuilder built pass chain, but it allows us to cover this configuration with the buildbots. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298081	2017-03-17 13:10:05 +00:00
Michael Kruse	f3091bf4cf	[PruneUnprofitable] Add -polly-prune-unprofitable pass. ScopInfo's normal profitability heuristic considers SCoPs where all statements have scalar writes as not profitably optimizable and invalidate the SCoP in that case. However, -polly-delicm and -polly-simplify may be able to remove some of the scalar writes such that the flag -polly-unprofitable-scalar-accs=false allows disabling that part of the heuristic. In cases where DeLICM (or other passes after ScopInfo) are not successful in removing scalar writes, the SCoP is still not profitably optimizable. The schedule optimizer would again try computing another schedule, resulting in slower compilation. The -polly-prune-unprofitable pass applies the profitability heuristic again before the schedule optimizer Polly can still bail out even with -polly-unprofitable-scalar-accs=false. Differential Revision: https://reviews.llvm.org/D31033 llvm-svn: 298080	2017-03-17 13:09:52 +00:00
Tobias Grosser	5842dee251	[ScopInfo] Add option to not add parameter bounds to context [NFC] For experiments it is sometimes helpful to provide parameter bound information to polly and to not use these parameter bounds for simplification. Add a new option "-polly-ignore-parameter-bounds" which does precisely this. llvm-svn: 298077	2017-03-17 13:00:53 +00:00
Siddharth Bhat	db5dd14cbb	[DependenceInfo] Replace use of deprecated isl_dim_n_out [NFC] Change isl_dim_n_out to isl_map_dim(*, isl_dim_out) llvm-svn: 298075	2017-03-17 12:59:01 +00:00
Siddharth Bhat	65f3d5201e	[DependenceInfo] Track may-writes and build flow information in Dependences::calculateDependences. This ensures that we handle may-writes correctly when building dependence information. Also add a test case checking correctness of may-write information. Not handling it before was an oversight. Differential Revision: https://reviews.llvm.org/D31075 llvm-svn: 298074	2017-03-17 12:31:28 +00:00
Tobias Grosser	8a6e605e96	[ScopInfo] Do not take inbounds assumptions [NFC] For experiments it is sometimes helpful to not take any inbounds assumptions. Add a new option "-polly-ignore-inbounds" which does precisely this. llvm-svn: 298073	2017-03-17 12:26:58 +00:00
Tobias Grosser	b58ed8d3cd	[ScopInfo] Do not try to eliminate parameter dimensions that do not exist In subsequent changes we will make Polly a little bit more lazy in adding parameter dimensions to different sets. As a result, not all parameters will always be part of the parameter space. This change ensures that we do not use the '-1' returned when a parameter dimension cannot be found, but instead just do not try to eliminate the anyhow non-existing dimension. llvm-svn: 298054	2017-03-17 09:02:53 +00:00
Tobias Grosser	941cb7d979	[ScopInfo] Do not expand getDomains() to full parameter space. Since several years, isl can perform most operations on sets with differing parameter spaces, by expanding the parameter space on demand relying using named isl ids to distinguish different parameter dimensions. By not always expanding to full dimensionality the set remain smaller and can likely be operated on faster. This change by itself did not yet result in measurable performance benefits, but it is a step into the right direction needed to ensure that subsequent changes indeed can work with lower-dimensional sets and these sets do not get blown up by accident when later intersected with the domain context. llvm-svn: 298053	2017-03-17 09:02:50 +00:00
Tobias Grosser	f4fe34bfb8	Update to isl-0.18-387-g3fa6191 This is a normal / regular maintenance update. llvm-svn: 297999	2017-03-16 21:33:20 +00:00
Siddharth Bhat	65c4026992	Set Dependences::RED to be non-null once Dependences::calculateDependences() occurs, even if there is no actual reduction. This ensures correctness with isl operations. llvm-svn: 297981	2017-03-16 20:06:49 +00:00
Michael Kruse	5545407fa4	[ScopInfo] Introduce ScopStmt::getSurroundingLoop(). NFC. Introduce ScopStmt::getSurroundingLoop() to replace getFirstNonBoxedLoopFor. getSurroundingLoop() returns the precomputed surrounding/first non-boxed loop. Except in ScopDetection, the list of boxed loops is only used to get the surrounding loop. getFirstNonBoxedLoopFor also requires LoopInfo at every use which is not necessarily available everywhere where we may want to use it. Differential Revision: https://reviews.llvm.org/D30985 llvm-svn: 297899	2017-03-15 22:16:43 +00:00
Michael Kruse	0446d81e2d	[Simplify] Add -polly-simplify pass. This new pass removes unnecessary accesses and writes. It currently supports 2 simplifications, but more are planned. It removes write accesses that write a loaded value back to the location it was loaded from. It is a typical artifact from DeLICM. Removing it will get rid of bogus dependencies later in dependency analysis. It also removes statements without side-effects. ScopInfo already removes these, but the removal of unnecessary writes can result in more side-effect free statements. Differential Revision: https://reviews.llvm.org/D30820 llvm-svn: 297473	2017-03-10 16:05:24 +00:00
Tobias Grosser	8bd7f3c0a5	[ScopDetect/Info] Allow unconditional hoisting of loads from dereferenceable ptrs In case LLVM pointers are annotated with !dereferencable attributes/metadata or LLVM can look at the allocation from which a pointer is derived, we can know that dereferencing pointers is safe and can be done unconditionally. We use this information to proof certain pointers as save to hoist and then hoist them unconditionally. llvm-svn: 297375	2017-03-09 11:36:00 +00:00
Michael Kruse	6744efa8d8	[ScopDetection] Only allow SCoP-wide available base pointers. Simplify ScopDetection::isInvariant(). Essentially deny everything that is defined within the SCoP and is not load-hoisted. The previous understanding of "invariant" has a few holes: - Expressions without side-effects with only invariant arguments, but are defined withing the SCoP's region with the exception of selects and PHIs. These should be part of the index expression derived by ScalarEvolution and not of the base pointer. - Function calls with that are !mayHaveSideEffects() (typically functions with "readnone nounwind" attributes). An example is given below. @C = external global i32 declare float* @getNextBasePtr(float) readnone nounwind ... %ptr = call float @getNextBasePtr(float* %A, float %B) The call might return: * %A, so %ptr aliases with it in the SCoP * %B, so %ptr aliases with it in the SCoP * @C, so %ptr aliases with it in the SCoP * a new pointer everytime it is called, such as malloc() * a pointer into the allocated block of one of the aforementioned * any of the above, at random at each call Hence and contrast to a comment in the base_pointer.ll regression test, %ptr is not necessarily the same all the time. It might also alias with anything and no AliasAnalysis can tell otherwise if the definition is external. It is hence not suitable in the role of a base pointer. The practical problem with base pointers defined in SCoP statements is that it is not available globally in the SCoP. The statement instance must be executed first before the base pointer can be used. This is no problem if the base pointer is transferred as a scalar value between statements. Uses of MemoryAccess::setNewAccessRelation may add a use of the base pointer anywhere in the array. setNewAccessRelation is used by JSONImporter, DeLICM and D28518. Indeed, BlockGenerator currently assumes that base pointers are available globally and generates invalid code for new access relation (referring to the base pointer of the original code) if not, even if the base pointer would be available in the statement. This could be fixed with some added complexity and restrictions. The ExprBuilder must lookup the local BBMap and code that call setNewAccessRelation must check whether the base pointer is available first. The code would still be incorrect in the presence of aliasing. There is the switch -polly-ignore-aliasing to explicitly allow this, but it is hardly a justification for the additional complexity. It would still be mostly useless because in most cases either getNextBasePtr() has external linkage in which case the readnone nounwind attributes cannot be derived in the translation unit itself, or is defined in the same translation unit and gets inlined. Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D30695 llvm-svn: 297281	2017-03-08 15:14:46 +00:00
Michael Kruse	5a4ec5c42b	[ScopDetection] Require LoadInst base pointers to be hoisted. Only when load-hoisted we can be sure the base pointer is invariant during the SCoP's execution. Most of the time it would be added to the required hoists for the alias checks anyway, except with -polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if AliasAnalysis is already sure it doesn't alias with anything (for instance if there is no other pointer to alias with). Two more parts in Polly assume that this load-hoisting took place: - setNewAccessRelation() which contains an assert which tests this. - BlockGenerator which would use to the base ptr from the original code if not load-hoisted (if the access expression is regenerated) Differential Revision: https://reviews.llvm.org/D30694 llvm-svn: 297195	2017-03-07 20:28:43 +00:00
Tobias Grosser	ce69e7b593	[ScopInfo] Avoid infinite loop during schedule construction Our current scop modeling enters an infinite loop when trying to model code that has unreachable instructions (e.g., test/ScopInfo/BoundChecks/single-loop.ll), as the number of basic blocks returned by the LLVM Loop* does not include unreachable basic blocks that branch off from the core loop body. This arises for example in the following piece of code: for (i = 0; i < N; i++) { if (i > 1024) abort(); <- this abort might be translated to an unreachable A[i] = ... } This patch adds these unreachable basic blocks in our per loop basic block count to ensure that the schedule construction does not assume a loop has been processed completely, despite certain unreachable basic blocks still remaining. The infinite loop is only observable in combination with https://reviews.llvm.org/D12676 or a similar patch. llvm-svn: 297156	2017-03-07 16:17:55 +00:00
Tobias Grosser	134a572951	[ScopDetection] Do not detect scops that exit to an unreachable Scops that exit with an unreachable are today still permitted, but make little sense to optimize. We therefore can already skip them during scop detection. This speeds up scop detection in certain cases and also ensures that bugpoint does not introduce unreachables when reducing test cases. In practice this change should have little impact, as the performance of unreachable code is unlikely to matter. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297151	2017-03-07 15:50:43 +00:00
Tobias Grosser	1c787e0b49	[ScopDetection] Do not allow required-invariant loads in non-affine region These loads cannot be savely hoisted as the condition guarding the non-affine region cannot be duplicated to also protect the hoisted load later on. Today they are dropped in ScopInfo. By checking for this early, we do not even try to model them and possibly can still optimize smaller regions not containing this specific required-invariant load. llvm-svn: 296744	2017-03-02 12:15:37 +00:00
Tobias Grosser	c2f151084d	[ScopInfo] Disable memory folding in case it results in multi-disjunct relations Multi-disjunct access maps can easily result in inbound assumptions which explode in case of many memory accesses and many parameters. This change reduces compilation time of some larger kernel from over 15 minutes to less than 16 seconds. Interesting is the test case test/ScopInfo/multidim_param_in_subscript.ll which has a memory access [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] } which requires folding, but where only a single disjunct remains. We can still model this test case even when only using limited memory folding. For people only reading commit messages, here the comment that explains what memory folding is: To recover memory accesses with array size parameters in the subscript expression we post-process the delinearization results. We would normally recover from an access A[exp0(i) * N + exp1(i)] into an array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the range of exp1(i) - may be preferrable. Specifically, for cases where we know exp1(i) is negative, we want to choose the latter expression. As we commonly do not have any information about the range of exp1(i), we do not choose one of the two options, but instead create a piecewise access function that adds the (-1, N) offsets as soon as exp1(i) becomes negative. For a 2D array such an access function is created by applying the piecewise map: [i,j] -> [i, j] : j >= 0 [i,j] -> [i-1, j+N] : j < 0 After this patch we generate only the first case, except for situations where we can proove the first case to be invalid and can consequently select the second without introducing disjuncts. llvm-svn: 296679	2017-03-01 21:11:27 +00:00
Tobias Grosser	d7c4975349	[ScopInfo] Simplify inbounds assumptions under domain constraints Without this simplification for a loop nest: void foo(long n1_a, long n1_b, long n1_c, long n1_d, long p1_b, long p1_c, long p1_d, float A_1[][p1_b][p1_c][p1_d]) { for (long i = 0; i < n1_a; i++) for (long j = 0; j < n1_b; j++) for (long k = 0; k < n1_c; k++) for (long l = 0; l < n1_d; l++) A_1[i][j][k][l] += i + j + k + l; } the assumption: n1_a <= 0 or (n1_a > 0 and n1_b <= 0) or (n1_a > 0 and n1_b > 0 and n1_c <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d > 0 and p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d) is taken rather than the simpler assumption: p9_b >= n9_b and p9_c >= n9_c and p9_d >= n9_d. The former is less strict, as it allows arbitrary values of p1_* in case, the loop is not executed at all. However, in practice these precise constraints explode when combined across different accesses and loops. For now it seems to make more sense to take less precise, but more scalable constraints by default. In case we find a practical example where more precise constraints are needed, we can think about allowing such precise constraints in specific situations where they help. This change speeds up the new test case from taking very long (waited at least a minute, but it probably takes a lot more) to below a second. llvm-svn: 296456	2017-02-28 09:45:54 +00:00
Michael Kruse	2c7169d00c	[DependenceInfo] Remove unused variable. NFC. llvm-svn: 295987	2017-02-23 15:41:01 +00:00
Michael Kruse	dd6f29375b	[DependenceInfo] Use references instead of double pointers. NFC. Non-const references are the more C++-ish way to modify a variable passed by the caller. llvm-svn: 295986	2017-02-23 15:40:56 +00:00
Michael Kruse	ec8fc32160	[DependenceInfo] Rename StmtScheduleDomain -> TaggedStmtDomain. NFC. llvm-svn: 295985	2017-02-23 15:40:52 +00:00
Michael Kruse	00c38e0df2	[DependenceInfo] Simplify use of StmtSchedule's domain [NFC] Once a StmtSchedule is created, only its domain is used anywhere within DependenceInfo::calculateDependences. So, we choose to return the wrapped domain of the union_map rather than the entire union_map. However, we still build the union_map first within collectInfo(). It is cleaner to first build the entire union_map and then pull the domain out in one shot, rather than repeatedly extracting the domain in bits and pieces from accdom. Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30208 llvm-svn: 295984	2017-02-23 15:40:46 +00:00
Michael Kruse	52ab4943b4	Remove all references to PostDominators. NFC. Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the the Polly pass chain to be interrupted. It is not used by any Polly pass anymore, hence we can remove all references to it. llvm-svn: 295983	2017-02-23 15:16:22 +00:00
Tobias Grosser	cc43087afc	[DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC] We only ever use the wrapped domain of AccessSchedule, so stop creating an entire union_map and then pulling the domain out. Reviewers: grosser Tags: #polly Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30179 llvm-svn: 295726	2017-02-21 15:38:31 +00:00
Tobias Grosser	079d511891	[ScopInfo] Count read-only arrays when computing complexity of alias check Instead of counting the number of read-only accesses, we now count the number of distinct read-only array references when checking if a run-time alias check may be too complex. The run-time alias check is quadratic in the number of base pointers, not the number of accesses. Before this change we accidentally skipped SPEC's lbm test case. llvm-svn: 295567	2017-02-18 20:51:29 +00:00
Tobias Grosser	28492b85e2	[DependenceInfo] Pull out statement [NFC] This simplifies the code slightly. llvm-svn: 295551	2017-02-18 16:41:28 +00:00
Tobias Grosser	8ee46985d2	[Dependences] Compute reduction dependences on schedule tree [NFC] This change gets rid of the need for zero padding, makes the reduction computation code more similar to the normal dependence computation, and also better documents what we do at the moment. Making the dependence computation for reductions a little bit easier to understand will hopefully help us to further reduce code duplication. This reduces the time spent only in the reduction dependence pass from 260ms to 150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over 40% in dependence computation time. This change was inspired by discussions with Michael Kruse, Utpal Bora, Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further cleanups of the reduction code. llvm-svn: 295550	2017-02-18 16:39:04 +00:00
Tobias Grosser	2461021150	Drop leftover debug statement llvm-svn: 295444	2017-02-17 13:39:45 +00:00
Tobias Grosser	cd01a363d6	[ScopInfo] Add statistics to count loops after scop modeling llvm-svn: 295431	2017-02-17 08:12:36 +00:00
Tobias Grosser	65ce9362b8	[ScopDetection] Compute the maximal loop depth correctly Before this change, we obtained loop depth numbers that were deeper then the actual loop depth. llvm-svn: 295430	2017-02-17 08:08:54 +00:00
Tobias Grosser	ca2cfd0bd8	[ScopInfo] Do not try to fold array dimensions of size zero Trying to fold such kind of dimensions will result in a division by zero, which crashes the compiler. As such arrays are likely to invalidate the scop anyhow (but are not illegal in LLVM-IR), there is no point in trying to optimize the array layout. Hence, we just avoid the folding of constant dimensions of size zero. llvm-svn: 295415	2017-02-17 04:48:52 +00:00
Tobias Grosser	90411a967b	[ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC] There is only a single disjunction. However, we bound the number of 'disjuncts' in this disjunction. Name the variable accordingly. llvm-svn: 295362	2017-02-16 19:11:33 +00:00
Tobias Grosser	c8a8276710	[ScopInfo] Bound the number of disjuncts in context Before this change wrapping range metadata resulted in exponential growth of the context, which made context construction of large scops very slow. Instead, we now just do not model the range information precisely, in case the number of disjuncts in the context has already reached a certain limit. llvm-svn: 295360	2017-02-16 19:11:25 +00:00
Tobias Grosser	98a3aa4f19	[ScopInfo] Use uppercase variable name [NFC] llvm-svn: 295350	2017-02-16 18:39:18 +00:00
Tobias Grosser	3281f601bb	[ScopInfo] Always derive upper and lower bounds for parameters Commit r230230 introduced the use of range metadata to derive bounds for parameters, instead of just looking at the type of the parameter. As part of this commit support for wrapping ranges was added, where the lower bound of a parameter is larger than the upper bound: { 255 < p \|\| p < 0 } However, at the same time, for wrapping ranges support for adding bounds given by the size of the containing type has acidentally been dropped. As a result, the range of the parameters was not guaranteed to be bounded any more. This change makes sure we always add the bounds given by the size of the type and then additionally add bounds based on signed wrapping, if available. For a parameter p with a type size of 32 bit, the valid range is then: { -2147483648 <= p <= 2147483647 and (255 < p or p < 0) } llvm-svn: 295349	2017-02-16 18:39:14 +00:00
Tobias Grosser	288c450cf6	[ScopDetectDiagnostics] Do not format unnamed array names Formatting unnamed array names is expensive in LLVM as the this requires deriving the numbered virtual instruction name (e.g., %12) for an llvm::Value, which is currently not implemented efficiently. As instruction numberes anyhow do not really carry a lot of information for the user, we just print 'unknown' instead. This change reduces the scop detection time from 24 to 19 seconds, for one of our large-scale inputs. This is a reduction by 21%. llvm-svn: 294894	2017-02-12 10:53:02 +00:00
Tobias Grosser	9fe37df27c	[ScopDetection] Add statistics to count the maximal number of scops in loop llvm-svn: 294893	2017-02-12 10:52:57 +00:00
Tobias Grosser	b3a85884f7	Do not use wrapping ranges to bound non-affine accesses When deriving the range of valid values of a scalar evolution expression might be a range [12, 8), where the upper bound is smaller than the lower bound and where the range is expected to possibly wrap around. We theoretically could model such a range as a union of two non-wrapping ranges, but do not do this as of yet. Instead, we just do not derive any bounds. Before this change, we could have obtained bounds where the maximal possible value is strictly smaller than the minimal possible value, which is incorrect and also caused assertions during scop modeling. llvm-svn: 294891	2017-02-12 08:11:12 +00:00
Tobias Grosser	296fe2e2ad	[ScopInfo] Use original base address when building ScopArrayInfo [NFC] This change clarfies that we want to indeed use the original base address when creating the ScopArrayInfo that corresponds to a given memory access. This change prepares for https://reviews.llvm.org/D28518. llvm-svn: 294734	2017-02-10 10:09:46 +00:00
Tobias Grosser	5db171a9da	[ScopInfo] Use getAccessValue to obtain the accessed value This replaces the use of getOriginalAddrPtr, a value that is stored in ScopArrayInfo and might at some point not be unique any more. However, the access value is defined to be unique. This change is an update on r294576, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294733	2017-02-10 10:09:44 +00:00
Tobias Grosser	e24b7b929d	[ScopInfo] Use MemoryAccess::getScopArrayInfo() interface to access Array [NFC] By using the public interface MemoryAccess::getScopArrayInfo() we avoid the direct access to the ScopArrayInfoMap and as a result also do not need to use the BasePtr as key. This change makes the code cleaner. The const-cast we introduce is a little ugly. We may consider to drop const correctness for getScopArrayInfo() at some point. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294655	2017-02-09 23:24:57 +00:00
Tobias Grosser	9c7d181c92	[ScopInfo] Use types instead of 'auto' and use more descriptive variable names [NFC] LLVM's coding conventions suggest to use auto only in obvious cases. Hence, we move this code to actually declare the types used. We also replace the variable name 'SAI', with the name 'Array', as this improves readability. llvm-svn: 294654	2017-02-09 23:24:54 +00:00
Tobias Grosser	889830b1c5	[ScopInfo] Use ScopArrayInfo instead of base address When building alias groups, we sort different ScopArrays into unrelated groups. Historically we identified arrays through their base pointer, as no ScopArrayInfo class was yet available. This change changes the alias group construction to reference arrays through their ScopArrayInfo object. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294649	2017-02-09 23:12:22 +00:00
Tobias Grosser	be372d5a04	[ScopInfo] Expect the OriginalBaseAddr when looking at underlying instructions [NFC] During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576	2017-02-09 10:11:58 +00:00
Tobias Grosser	e0e0e4d4f6	[ScopInfo] Remove unnecessary indirection through SCEV [NFC] The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575	2017-02-09 09:34:46 +00:00
Tobias Grosser	114f6d6ff5	[DependenceInfo] Use ScopArrayInfo to keep track of arrays [NFC] When computing reduction dependences we first identify all ScopArrays which are part of reductions and then only compute for these ScopArrays the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567	2017-02-09 08:06:05 +00:00
Tobias Grosser	ff40087a6a	Update to recent formatting changes llvm-svn: 293756	2017-02-01 10:12:09 +00:00
Daniel Jasper	baaa152294	Fix format after recent clang-format change. llvm-svn: 293753	2017-02-01 09:31:42 +00:00
Tobias Grosser	77363965c0	[ScopDetectionDiagnostic] Add meaningfull enduser message for regions with entry block Before this change the user only saw "Unspecified Error", when a region contained the entry block. Now we report: "Scop contains function entry (not yet supported)." llvm-svn: 293169	2017-01-26 10:41:37 +00:00
Tobias Grosser	64bbb1357f	ScopDetectionDiagnostics: Also emit diagnostics in case no debug info is available In this case, we just use the start of the scop as the debug location. llvm-svn: 293165	2017-01-26 10:30:55 +00:00
Tobias Grosser	e1ff0cf2eb	Relax assert when setting access functions with invariant base pointers Summary: Instead of forbidding such access functions completely, we verify that their base pointer has been hoisted and only assert in case the base pointer was not hoisted. I was trying for a little while to get a test case that ensures the assert is correctly fired in case of invariant load hoisting being disabled, but I could not find a good way to do so, as llvm-lit immediately aborts if a command yields a non-zero return value. As we do not generally test our asserts, not having a test case here seems OK. This resolves http://llvm.org/PR31494 Suggested-by: Michael Kruse <llvm@meinersbur.de> Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D28798 llvm-svn: 292213	2017-01-17 12:00:42 +00:00
Eli Friedman	71329901ea	Tidy up getFirstNonBoxedLoopFor [NFC] Move the function getFirstNonBoxedLoopFor which is used in ScopBuilder and in ScopInfo to Support/ScopHelpers to make it reusable in other locations. No functionality change. Patch by Sameer Abu Asal. Differential Revision: https://reviews.llvm.org/D28754 llvm-svn: 292168	2017-01-16 22:54:29 +00:00
Tobias Grosser	0032d87337	ScopInfo: document base pointers in alias-checks must be invariant [NFC] Before this change, this code has been mixed with a check for non-affine loops (and when originally introduce was also duplicated). By creating a separate loop and explicitly documenting this property, the current behavior becomes a lot more clear. llvm-svn: 292140	2017-01-16 15:49:14 +00:00
Tobias Grosser	f3c145f2ab	ScopInfo: Improve comments in buildAliasGroup [NFC] llvm-svn: 292139	2017-01-16 15:49:09 +00:00
Tobias Grosser	77f3257b41	ScopInfo: split out construction of a single alias group [NFC] The loop body in buildAliasGroups is still too large to easily scan it. Hence, we split the loop body out into a separate function to improve readability. llvm-svn: 292138	2017-01-16 15:49:07 +00:00
Tobias Grosser	e95222343c	ScopInfo: Do not modify the original alias group [NFC] Instead of modifying the original alias group and repurposing it as read-write access group when splitting accesses in read-only and read-write accesses, we just keep all three groups: the original alias group, the set of read-only accesses and the set of read-write accesses. This allows us to remove some complicated iterator handling and also allows for more code-reuse in calculateMinMaxAccess. llvm-svn: 292137	2017-01-16 15:49:04 +00:00
Tobias Grosser	457eb579dd	ScopInfo: No need to keep ReadOnlyAccesses in an additional map [NFC] It seems over time we added an additional map that maps from the base address of a read-only access to the actual access. However this map is never used. Drop the creation and use of this map to simplify our alias check generation code. llvm-svn: 292126	2017-01-16 14:24:48 +00:00
Tobias Grosser	dba2206b65	ScopInfo: no need to clear alias group explicitly The alias group will anyhow be cleared at the end of this function and is not used afterwards. We avoid an explicit clear() call at multiple places to improve readability of this code. llvm-svn: 292125	2017-01-16 14:13:01 +00:00
Tobias Grosser	21a059af09	Adjust formatting to commit r292110 [NFC] llvm-svn: 292123	2017-01-16 14:08:10 +00:00
Tobias Grosser	92fd612c84	ScopInfo: Fold SmallVectors used in alias check generation back into loop [NFC] Hoisting small vectors out of a loop seems to be a pure performance optimization, which is unlikely to have great impact in practice. As this hoisting just increases code-complexity, we fold the SmallVectors back into the loop. In subsequent commits, we will further simplify and structure this code, but we committed this change separately to provide an explanation to make clear that we purposefully reverted this optimization. llvm-svn: 292122	2017-01-16 14:08:02 +00:00
Tobias Grosser	e39f9127f9	ScopInfo: Extract out splitAliasGroupsByDomain [NFC] The function buildAliasGroups got very large. We extract out the splitting of alias groups to reduce its size and to better document the current behavior. llvm-svn: 292121	2017-01-16 14:08:00 +00:00
Tobias Grosser	9edcf07e83	ScopInfo: Extract out buildAliasGroupsForAccesses [NFC] The function buildAliasGroups got very large. We extract out the actual construction of alias groups to reduce its size and to better document the current behavior. llvm-svn: 292120	2017-01-16 14:07:57 +00:00
Hongbin Zheng	6aded2a0e4	Fix compilation on MSVC, NFC Differential Revision: https://reviews.llvm.org/D28739 llvm-svn: 292067	2017-01-15 16:47:26 +00:00
Tobias Grosser	4d5a917287	Use typed enums to model MemoryKind and move MemoryKind out of ScopArrayInfo To benefit of the type safety guarantees of C++11 typed enums, which would have caught the type mismatch fixed in r291960, we make MemoryKind a typed enum. This change also allows us to drop the 'MK_' prefix and to instead use the more descriptive full name of the enum as prefix. To reduce the amount of typing needed, we use this opportunity to move MemoryKind from ScopArrayInfo to a global scope, which means the ScopArrayInfo:: prefix is not needed. This move also makes historically sense. In the beginning of Polly we had different MemoryKind enums in both MemoryAccess and ScopArrayInfo, which were later canonicalized to one. During this canonicalization we just choose the enum in ScopArrayInfo, but did not consider to move this shared enum to global scope. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D28090 llvm-svn: 292030	2017-01-14 20:25:44 +00:00
Eli Friedman	c6e3b6f156	Delete stray isl_map_dump call. llvm-svn: 291521	2017-01-10 01:08:11 +00:00
Tobias Grosser	cdbe5c9d6c	Fix some typos in comments llvm-svn: 291247	2017-01-06 17:30:34 +00:00
Tobias Grosser	b6945e3301	Fix clang-format llvm-svn: 290103	2016-12-19 14:06:40 +00:00
Daniel Jasper	e5f3eba9c3	Fix format after recent clang-format change. llvm-svn: 290085	2016-12-19 07:54:15 +00:00
Michael Kruse	7037fde427	Remove references to AssumptionCache. NFC. The AssumptionCache was removed in r289756 after being replaced by the an addtional operand list of affected values in r289755. The absence of that cache means that we have now have to manually search for llvm.assume intrinsics as now done by other passes (LazyValueInfo, CodeMetrics) do not take into account an llvm::Instruction's user lists (ScalarEvolution). llvm-svn: 289791	2016-12-15 09:25:14 +00:00
Tobias Grosser	b02b6a8404	Adjust clang-format formatting to r289531 clang-format has been updated in r289531 to keep labels and values on the same line. This change updates Polly to the new formatting style. llvm-svn: 289533	2016-12-13 12:44:00 +00:00
Johannes Doerfert	bda814350a	Allow to disable unsigned operations (zext, icmp ugt, ...) Unsigned operations are often useful to support but the heuristics are not yet tuned. This options allows to disable them if necessary. llvm-svn: 288521	2016-12-02 17:55:41 +00:00
Johannes Doerfert	a94ae1aede	Do not allow multiple possibly aliasing ptrs in an expression Relational comparisons should not involve multiple potentially aliasing pointers. Similarly this should hold for switch conditions and the two conditions involved in equality comparisons (separately!). This is a heuristic based on the C semantics that does only allow such operations when the base pointers do point into the same object. Since this makes aliasing likely we will bail out early instead of producing a probably failing runtime check. llvm-svn: 288516	2016-12-02 17:49:52 +00:00
Tobias Grosser	bedef00e2c	[ScopInfo] Fold constant coefficients in array dimensions to the right This allows us to delinearize code such as the one below, where the array sizes are A[][2 * n] as there are n times two elements in the innermost dimension. Alternatively, we could try to generate another dimension for the struct in the innermost dimension, but as the struct has constant size, recovering this dimension is easy. struct com { double Real; double Img; }; void foo(long n, struct com A[][n]) { for (long i = 0; i < 100; i++) for (long j = 0; j < 1000; j++) A[i][j].Real += A[i][j].Img; } int main() { struct com A[100][1000]; foo(1000, A); llvm-svn: 288489	2016-12-02 08:10:56 +00:00
Tobias Grosser	491b799a4d	[ScopInfo] Separate construction and finalization of memory accesses [NFC] After having built memory accesses we perform some additional transformations on them to increase the chances that our delinearization guesses the right shape. Only after these transformations, we take the assumptions that the array shape we predict is such that no out-of-bounds memory accesses arise. Before this change, the construction of the memory access, the access folding that improves the represenation for certain parametric subscripts, and taking the assumption was all done right after a memory access was created. In this change we split this now into three separate iterations over all memory accesses. This means only after all memory accesses have been built, we start to canonicalize accesses, and to take assumptions. This split prepares for future canonicalizations that must consider all memory accesses for deriving additional beneficial transformations. llvm-svn: 288479	2016-12-02 05:21:22 +00:00
Johannes Doerfert	b1d6608430	[NFC] Check for feasibility prior to the profitability check Feasibility is checked late on its own but early it is hidden behind the "PollyProcessUnprofitable" guard. This change will make sure we opt out early if the runtime context is infeasible anyway. llvm-svn: 288329	2016-12-01 11:12:14 +00:00
Michael Kruse	11c5e07925	canSynthesize: Remove unused argument LI. NFC. The helper function polly::canSynthesize() does not directly use the LoopInfo analysis, hence remove it from its argument list. llvm-svn: 288144	2016-11-29 15:11:04 +00:00
Tobias Grosser	278f9e7d27	[ScopInfo] Use SCEVRewriteVisitor to simplify SCEVSensitiveParameterRewriter [NFC] llvm-svn: 287984	2016-11-26 17:58:40 +00:00
Tobias Grosser	b45ae5601b	[ScopDetect] Expand statistics of the detected scops We now collect: Number of total loops Number of loops in scops Number of scops Number of scops with maximal loop depth 1 Number of scops with maximal loop depth 2 Number of scops with maximal loop depth 3 Number of scops with maximal loop depth 4 Number of scops with maximal loop depth 5 Number of scops with maximal loop depth 6 and larger Number of loops in scops (profitable scops only) Number of scops (profitable scops only) Number of scops with maximal loop depth 1 (profitable scops only) Number of scops with maximal loop depth 2 (profitable scops only) Number of scops with maximal loop depth 3 (profitable scops only) Number of scops with maximal loop depth 4 (profitable scops only) Number of scops with maximal loop depth 5 (profitable scops only) Number of scops with maximal loop depth 6 and larger (profitable scops only) These statistics are certainly completely accurate as we might drop scops when building up their polyhedral representation, but they should give a good indication of the number of scops we detect. llvm-svn: 287973	2016-11-26 07:37:46 +00:00
Tobias Grosser	5c00b0dc74	[ScopDetectionDiagnostic] Collect statistics for each diagnostic type Our original statistics were added before we introduced a more fine-grained diagnostic system, but the granularity of our statistics has never been increased accordingly. This change introduces now one statistic counter per diagnostic to enable us to collect fine-grained statistics about who certain scops are not detected. In case coarser grained statistics are needed, the user is expected to combine counters manually. llvm-svn: 287968	2016-11-26 05:53:09 +00:00
Tobias Grosser	c64269ea1b	[ScopDectionDiagnostic] Use scoped enums instead three letter prefix [NFC] This improves readability of the code. llvm-svn: 287963	2016-11-26 03:44:31 +00:00
Hongbin Zheng	4ff1c3983d	Fix typo. llvm-svn: 287819	2016-11-23 21:59:33 +00:00
Hongbin Zheng	a8fb73fc0b	Split ScopInfo::addScopStmt into two versions. NFC One for adding statement for region, another one for BB llvm-svn: 287566	2016-11-21 20:09:40 +00:00
Tobias Grosser	1f0236d8e5	[ScopDetect] Use mayReadOrWriteMemory to shorten condition llvm-svn: 287525	2016-11-21 09:07:30 +00:00
Tobias Grosser	b94e9b31d0	[ScopDetect] Remove unnecessary namespace qualifier llvm-svn: 287524	2016-11-21 09:04:45 +00:00
Johannes Doerfert	81aa6e882f	[NFC] Adjust naming scheme of statistic variables Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 287347	2016-11-18 14:37:08 +00:00
Johannes Doerfert	6cd59e9076	Probably overwritten loads should not be considered hoistable Do not assume a load to be hoistable/invariant if the pointer is used by another instruction in the SCoP that might write to memory and that is always executed. llvm-svn: 287272	2016-11-17 22:25:17 +00:00
Johannes Doerfert	c97654681e	[FIX] Do not try to hoist memory intrinsic Since we do not necessarily treat memory intrinsics as non-affine anymore, we have to check for them explicitly before we try to hoist an access. llvm-svn: 287270	2016-11-17 22:11:56 +00:00
Johannes Doerfert	b3265a3612	[NFC] Skip over trivial assumptions Filter trivial assumptions, thus assume { : } or restrict { : 0 = 1 }, as they clutter the user output as well as the statistics. llvm-svn: 287269	2016-11-17 22:08:40 +00:00
Johannes Doerfert	cfadb2293f	[DBG] Collect statistics about statically infeasible SCoPs llvm-svn: 287263	2016-11-17 21:44:47 +00:00
Johannes Doerfert	cd195326bf	[DBG] Collect statistics about taken assumptions llvm-svn: 287261	2016-11-17 21:41:08 +00:00
Tobias Grosser	26be8e99b6	[ScopBuilder] Drop unnecessary namespace identifiers [NFC] llvm-svn: 286781	2016-11-13 21:28:13 +00:00
Tobias Grosser	70d2709b1a	[ScopDetect] Conservatively handle inaccessible memory alias attributes Commit r286294 introduced support for inaccessiblememonly and inaccessiblemem_or_argmemonly attributes to BasicAA, which we need to support to avoid undefined behavior. This change just refuses all calls which are annotated with these attributes, which is conservatively correct. In the future we may consider to model and support such function calls in Polly. llvm-svn: 286771	2016-11-13 19:27:24 +00:00
Tobias Grosser	a2f8fa33aa	[ScopDetect] Evaluate and verify branches at branch condition, not icmp The validity of a branch condition must be verified at the location of the branch (the branch instruction), not the location of the icmp that is used in the branch instruction. When verifying at the wrong location, we may accept an icmp that is defined within a loop which itself dominates, but does not contain the branch instruction. Such loops cannot be modeled as we only introduce domain dimensions for surrounding loops. To address this problem we change the scop detection to evaluate and verify SCEV expressions at the right location. This issue has been around since at least r179148 "scop detection: properly instantiate SCEVs to the place where they are used", where we explicitly set the scope to the wrong location. Before this commit the scope was not explicitly set, which probably also resulted in the scope around the ICmp to be choosen. This resolves http://llvm.org/PR30989 Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286769	2016-11-13 19:27:04 +00:00
Tobias Grosser	f67433abd9	SCEVAffinator: pass parameter-only set to addRestriction if BB=nullptr Assumptions can either be added for a given basic block, in which case the set describing the assumptions is expected to match the dimensions of its domain. In case no basic block is provided a parameter-only set is expected to describe the assumption. The piecewise expressions that are generated by the SCEVAffinator sometimes have a zero-dimensional domain (e.g., [p] -> { [] : p <= -129 or p >= 128 }), which looks similar to a parameter-only domain, but is still a set domain. This change adds an assert that checks that we always pass parameter domains to addAssumptions if BB is empty to make mismatches here fail early. We also change visitTruncExpr to always convert to parameter sets, if BB is null. This change resolves http://llvm.org/PR30941 Another alternative to this change would have been to inspect all code to make sure we directly generate in the SCEV affinator parameter sets in case of empty domains. However, this would likely complicate the code which combines parameter and non-parameter domains when constructing a statement domain. We might still consider doing this at some point, but as this likely requires several non-local changes this should probably be done as a separate refactoring. Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286444	2016-11-10 11:44:10 +00:00
Tobias Grosser	bbaeda3fe5	Do not allow switch statements in loop latches In r248701 "Allow switch instructions in SCoPs" support for switch statements has been introduced, but support for switch statements in loop latches was incomplete. This change completely disables switch statements in loop latches. The original commit changed addLoopBoundsToHeaderDomain to support non-branch terminator instructions, but this change was incorrect: it added a check for BI != null to the if-branch of a condition, but BI was used in the else branch es well. As a result, when a non-branch terminator instruction is encounted a nullptr dereference is triggered. Due to missing test coverage, this bug was overlooked. r249273 "[FIX] Approximate non-affine loops correctly" added code to disallow switch statements for non-affine loops, if they appear in either a loop latch or a loop exit. We adapt this code to now prohibit switch statements in loop latches even if the control condition is affine. We could possibly add support for switch statements in loop latches, but such support should be evaluated and tested separately. This fixes llvm.org/PR30952 Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286426	2016-11-10 05:20:29 +00:00
Tobias Grosser	eba86a1208	ScopInfo: only run code needed for ASSERT in DEBUG mode Suggested-by: Johannes Doerfert llvm-svn: 286338	2016-11-09 04:24:49 +00:00
Tobias Grosser	744740ad91	ScopInfo: Ensure copy statement memory accesses are correct Add asserts that verify that the memory accesses of a new copy statement are defined for all domain instances the copy statement is defined for. llvm-svn: 286047	2016-11-05 21:02:43 +00:00
Michael Kruse	e1dc387731	[ScopInfo] Fix isl object leak. Fix return from function without releasing isl objects, which was introduced in r269055. llvm-svn: 285924	2016-11-03 15:19:41 +00:00
Eli Friedman	b9c6f01a81	[ScopInfo] Make memset etc. affine where possible. We don't actually check whether a MemoryAccess is affine in very many places, but one important one is in checks for aliasing. Differential Revision: https://reviews.llvm.org/D25706 llvm-svn: 285746	2016-11-01 20:53:11 +00:00
Tobias Grosser	ebb626e4b7	[ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter ScalarEvolution got at some pointer a SCEVRewriteVisitor. Use it to simplify our SCEVRemoveSMax visitor. llvm-svn: 285491	2016-10-29 06:19:34 +00:00
Michael Kruse	426e6f71f8	[ScopInfo] Fix: use raw source pointer. When adding an llvm.memcpy instruction to AliasSetTracker, it uses the raw source and target pointers which preserve bitcasts. MemAccInst::getPointerOperand() also returns the raw target pointers, but Scop::buildAliasGroups() did not for the source pointer. This lead to mismatches between AliasSetTracker and ScopInfo on which pointer to use. Fixed by also using raw pointers in Scop::buildAliasGroups(). llvm-svn: 285071	2016-10-25 13:37:43 +00:00
Mandeep Singh Grang	48e7add80f	[polly] Change SmallPtrSet which are being iterated into SmallSetVector Summary: Otherwise the lack of an iteration order results in non-determinism in codegen. Reviewers: _jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25863 llvm-svn: 284845	2016-10-21 17:29:10 +00:00
Michael Kruse	6a19d592da	[ScopDetect] Depend transitively on ScalarEvolution. ScopDetection might be queried by -dot-scops or -view-scops passes for which it accesses ScalarEvolution. llvm-svn: 284385	2016-10-17 13:29:20 +00:00
Michael Kruse	fa53c86dc1	[ScopInfo/CodeGen] ExitPHI reads are implicit. Under some conditions MK_Value read accessed where converted to MK_ExitPHI read accessed. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen. Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block. Except in the situation observed in r265261. Do not convert value accessed to ExitPHI accesses and do not handle value accesses like ExitPHI accessed in CodeGen anymore. llvm-svn: 284023	2016-10-12 16:31:09 +00:00
Michael Kruse	c9edc2ee8d	[DepInfo] Print -debug output outside of max-operations scope. ISL tries to simplify the polyhedral operations before printing its objects. This increases the operations counter and therefore can contribute to hitting the operations limit. Therefore the result could be different when -debug output is enabled, making debugging harder. llvm-svn: 283745	2016-10-10 11:45:59 +00:00
Michael Kruse	8bfba1ff46	[Support/DepInfo] Introduce IslMaxOperationsGuard and make DepInfo use it. NFC. IslMaxOperationsGuard defines a scope where ISL may abort operations because if it takes too many operations. Replace the call to the raw ISL interface by a use of the guard. IslMaxOperationsGuard provides a uniform way to define a maximal computation time for a code region in C++ using RAII. llvm-svn: 283744	2016-10-10 11:45:54 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Michael Kruse	6ab4476835	[ScopInfo] Add -polly-unprofitable-scalar-accs option. With this option one can disable the heuristic that assumes that statements with a scalar write access cannot be profitably optimized. Such a statement instances necessarily have WAW-dependences to itself. With DeLICM scalar accesses can be changed to array accesses, which can avoid these WAW-dependence. llvm-svn: 283233	2016-10-04 17:33:39 +00:00
Michael Kruse	ca7cbcca37	[ScopInfo] Scalar access do not have indirect base pointers. ScopArrayInfo used to determine base pointer origins by looking up whether the base pointer is a load. The "base pointer" for scalar accesses is the llvm::Value being accessed. This is only a symbolic base pointer, it represents the alloca variable (.s2a or .phiops) generated for it at code generation. This patch disables determining base pointer origin for scalars. A test case where this caused a crash will be added in the next commit. In that test SAI tried to get the origin base pointer that was only declared later, therefore not existing. This is probably only possible for scalars used in PHINode incoming blocks. llvm-svn: 283232	2016-10-04 17:33:34 +00:00
Tobias Grosser	349d1c3368	[ScopDetection] Remove redundant checks for endless loops Summary: Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks to reject endless loops which are now removed and replaced by a single check in `isValidLoop()`. For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is renamed to `ReportLoopHasNoExit`. The test case `ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well. The schedule generation in `buildSchedule()` is based on the following assumption: Given some block B that is contained in a loop L and a SESE region R, we assume that L is contained in R or the other way around. However, this assumption is broken in the presence of endless loops that are nested inside other loops. Therefore, in order to prevent erroneous behavior in `buildSchedule()`, r265280 introduced a corresponding check in `canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding another check to reject endless loops in `allowOverApproximatedRegion()` in r273905. Hence there existed two separate locations that handled this case. Thank you Johannes Doerfert for helping to provide the above background information. Reviewers: Meinersbur, grosser Subscribers: _jdoerfert, pollydev Differential Revision: https://reviews.llvm.org/D24560 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 281987	2016-09-20 17:05:22 +00:00
Tobias Grosser	fe74a7a1f5	GPGPU: Detect read-only scalar arrays ... and pass these by value rather than by reference. llvm-svn: 281837	2016-09-17 19:22:18 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Michael Kruse	19c9d99f45	Use value directly instead of reference. NFC. The alias to the array element is read-only and a primitive type (pointer), therefore use the value directly instead of a reference to it. llvm-svn: 281311	2016-09-13 09:56:05 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	55a7af7da5	ScopInfo: Make clear that no double-free problem exists When running the clang static analyser to check for memory issues, this code originally showed a double free, as the analyser was unable to understand that isl_set_free always returns NULL and consequently later uses of the isl object we just freed will never be reached. Without this knowledge, the analyser has to issue a warning. We refactor the code to make it clear that for empty maps the current loop iteration is aborted. llvm-svn: 280940	2016-09-08 14:08:07 +00:00
Tobias Grosser	b316dc166f	ScopDetection: Make sure we do not accidentally divide by zero This code path is likely never triggered, but by still handling this case locally we avoid warnings in clangs static analyzer. llvm-svn: 280939	2016-09-08 14:08:05 +00:00
Tobias Grosser	adfc971820	DependenceInfo: Make clear that no double-free problem exists When running the clang static analyser to check for memory issues, this code originally showed a double free, as the analyser was unable to understand that isl_union_map_free always returns NULL and consequently later uses of the isl object we just freed will never be reached. Without this knowledge, the analyser has to issue a warning. We refactor the code to make it clear that for empty maps the current loop iteration is aborted. llvm-svn: 280938	2016-09-08 14:08:01 +00:00

... 3 4 5 6 7 ...

1376 Commits