llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	d091bf8d8e	[MatMul] Make MatMul detection independent of internal isl representations. The pattern recognition for MatMul is restrictive. The number of "disjuncts" in the isl_map containing constraint information was previously required to be 1 (as per isl_*_coalesce - which should ideally produce a domain map with a single disjunct, but does not under some circumstances). This was changed and made more flexible. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36460 llvm-svn: 311302	2017-08-20 21:31:11 +00:00
Tobias Grosser	d5f1fad77c	[Polly] Run early cse + memory SSA to remove redundancies in the input code This allows us to get rid of many identical loads as they commonly appear in Fortran code. llvm-svn: 311231	2017-08-19 08:44:46 +00:00
Andreas Simbuerger	8d5b257d02	[Polly][Bug fix] Wrong dependences filtering during Fully Indexed expansion Summary: When trying to expand memory accesses, the current version of Polly uses statement-level dependences. The actual implementation does not work in case of multiple dependences per statement. For example, in the following source code: ``` void mse(double A[Ni], double B[Nj], double C[Nj], double D[Nj]) { int i,j; for (j = 0; j < Ni; j++) { for (int i = 0; i<Nj; i++) S: B[i] = i; for (int i = 0; i<Nj; i++) T: D[i] = i; U: A[j] = B[j]; C[j] = D[j]; } } ``` The statement U has two dependences with S and T. The current version of Polly fails during expansion. This patch aims to fix this bug. For that, we use reference-level dependences to be able to filter dependences according to statement and memory ref. The principle of expansion remains the same as before. We also noticed that we need to bail out if a load comes after a store (at the same position) in the same statement. So a check was added to isExpandable. Contributed by: Nicholas Bonfante <nicolas.bonfante@insa-lyon.fr> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur, simbuerg Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36791 llvm-svn: 311165	2017-08-18 15:01:18 +00:00
Siddharth Bhat	dd616e9519	[ScopInliner] Move DEBUG_TYPE to below all includes to prevent cross-module interaction. [NFC] This fixes compile errors. llvm-svn: 311130	2017-08-17 22:21:16 +00:00
Siddharth Bhat	b46847c035	[ScopInliner] Add a simple Scop-based inliner to polly. We add a ScopInliner pass which inlines functions based on a simple heuristic: Let `g` call `f`. If we can model all of `f` as a Scop, we inline `f` into `g`. This requires `-polly-detect-full-function` to be enabled. So, the pass asserts that `-polly-detect-full-function` is enabled. Differential Revision: https://reviews.llvm.org/D36832 llvm-svn: 311126	2017-08-17 21:57:23 +00:00
Tobias Grosser	ed6a4acc7f	Add rewrite by-reference parameter pass Summary: This pass detangles induction variables from functions which take variables by reference. Most Fortran functions compiled with gfortran pass variables by reference. Unfortunately, a common pattern, printf calls of induction variables, prevents in this situation the promotion of the induction variable to a register, which in turn inhibits any kind of loop analysis. To work around this issue we developed a specialized pass which introduces separate alloca slots for known-read-only references, which indicate to the mem2reg pass that the induction variables can be promoted to registers and consequently enable SCEV to work. We currently hardcode the information that a function _gfortran_transfer_integer_write does not read its second parameter, as dragonegg does not add the right annotations and we cannot change old dragonegg releases. Hopefully flang will produce the right annotations. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: mgorny, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36800 llvm-svn: 311066	2017-08-17 05:25:08 +00:00
Tobias Grosser	990cbb4310	[Polly] Move Scop::restrictDomains to islpp. NFC. Reviewers: grosser, Meinersbur, bollu Differential Revision: https://reviews.llvm.org/D36659 llvm-svn: 310814	2017-08-14 06:49:01 +00:00
Reid Kleckner	8d719a27f5	Fix two warnings in polly, -Wmismatched-tags and -Wreorder llvm-svn: 310667	2017-08-10 21:46:22 +00:00
Michael Kruse	cd3b9fedc7	Remove dependency of Scop::getStmtFor(Inst) on getStmtFor(BB). NFC. We are working towards removing uses of Scop::getStmtFor(BB). In this patch, we remove dependency of Scop::getStmtFor(Inst) on getStmtFor(BB). To do so, we introduce a map of instructions to their corresponding scop statements and use it to get the instructions' statement. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35663 llvm-svn: 310494	2017-08-09 16:45:37 +00:00
Michael Kruse	36550bac0d	[ForwardOpTree] Set DEBUG_TYPE to "polly-optree". The previous value of "polly-delicm" was not changed when ForwardOpTree was split from DeLICM. Thanks to Tobias for noticing! llvm-svn: 310465	2017-08-09 12:27:35 +00:00
Michael Kruse	630fc7b82a	[ISLTools/ZoneAlgo] Make distributeDomain and filterKnownValInst isl_error_quota proof. distributeDomain() and filterKnownValInst() are used in a scope of ForwardOpTree that limits the number of isl operations. Therefore some isl functions may return null after any operation. Remove assertions that assume non-null results and handle isl_*_foreach returning isl::stat::error. I hope this fixes the crash of the aosp buildbot at ihevc_recon.c. llvm-svn: 310461	2017-08-09 11:21:40 +00:00
Michael Kruse	8756b3fbec	[ZoneAlgo] Add motivation for exception. NFC. Suggested-by: Hongbin Zheng <etherzhhb@gmail.com> llvm-svn: 310455	2017-08-09 09:29:15 +00:00
Michael Kruse	a9033aaba2	[ZoneAlgo] Consolidate condition. NFC. No need to create an OptimizationRemarkMissed object if we are not going to use it anyway. llvm-svn: 310454	2017-08-09 09:29:09 +00:00
Michael Kruse	ce67358281	[DeLICM/ZoneAlgo] Remove duplicate code. NFC. DeLICM and ZoneAlgo both implemented filterKnownValInst. Declare ZoneAlgo's version in the header and let DeLICM use it. llvm-svn: 310381	2017-08-08 17:00:27 +00:00
Roman Gareev	dbde718676	Do not use isl_set_project_out to get all loop prefixes Currently, only convex isolation sets can be efficiently processed by isl. Consequently, as a temporary solution, we use a different algorithm for partial tile isolation that helps to build convex isolation sets in some cases. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36278 llvm-svn: 310374	2017-08-08 16:15:33 +00:00
Michael Kruse	27c010a22e	[DeLICM] Properly handle PHI writes becoming empty partial writes. It is possible that partial writes are empty (the write is never executed). This happens when a PHINode's incoming edge is never taken, such that the incoming write becomes an empty partial write (if partial writes are enabled). The issue is that when converting the union_map to a map, its space cannot be derived from the union_map itself. Rather, we need to determine its space independently. This fixes test-suite's MultiSource/Benchmarks/ASC_Sequoia/CrystalMk. llvm-svn: 310348	2017-08-08 11:27:12 +00:00
Tobias Grosser	327e9ecb0d	[ScheduleOptimizer] Make matmul pattern detection work with delicm output In certain cases delicm might decide to not leave the original array write in the loop body, but to remove it and instead leave a transformed phi node as write access. This commit teaches the matmul pattern detection to order the memory accesses according to when the access actually happens and use this information to detect the new pattern. This makes pattern based matmul optimization work for 2mm and 3mm in polybench 4 after polly-position=before-vectorizer has been enabled. llvm-svn: 310338	2017-08-08 06:15:15 +00:00
Tobias Grosser	32f64ed22b	[DeLICM] Enable partial writes This allows us to remove more scalar dependences. While this feature is still rather experimental, we want to give it sufficient test coverage. llvm-svn: 310314	2017-08-07 22:06:07 +00:00
Tobias Grosser	2ef378120d	[ZoneAlgo] Allow two writes that write identical values into same array slot Two write statements which write into the very same array slot generally are conflicting. However, in case the value that is written is identical, this does not cause any problem. Hence, allow such write pairs in this specific situation. llvm-svn: 310311	2017-08-07 22:01:29 +00:00
Andreas Simbuerger	81fb6b3e40	[Polly] Fully-Indexed static expansion This commit implements the initial version of fully-indexed static expansion. ``` for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) S: B[j] = j; T: A[i] = B[i] ``` After the pass, we want this : ``` for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) S: B[i][j] = j; T: A[i] = B[i][i] ``` For now we bail (fail) in the following cases: - Scalar access - Multiple writes per SAI - MayWrite Access - Expansion that leads to an access to the original array Furthermore: We still miss checks for escaping references to the array base pointers. A future commit will add the missing escape-checks to stay correct in those cases. The expansion is still locked behind a CLI-Option and should not yet be used. Patch contributed by: Nicholas Bonfante <bonfante.nicolas@gmail.com> Reviewers: simbuerg, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: mgorny, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D34982 llvm-svn: 310304	2017-08-07 20:54:20 +00:00
Michael Kruse	70af4f579d	[ForwardOpTree] Use known array content analysis to forward load instructions. This is an addition to the -polly-optree pass that reuses the array content analysis from DeLICM to find array elements that contain the same value as the value loaded when the target statement instance is executed. The analysis is now enabled by default. The known content analysis could also be used to rematerialize any llvm::Value that was written to some array element, but currently only loads are forwarded. Differential Revision: https://reviews.llvm.org/D36380 llvm-svn: 310279	2017-08-07 18:40:29 +00:00
Tobias Grosser	61bd3a4840	[ScopInfo] Move Scop::getPwAffOnly to isl++ [NFC] llvm-svn: 310231	2017-08-06 21:42:38 +00:00
Tobias Grosser	31df6f31c0	[ScopInfo] Move Scop::getDomains to isl++ [NFC] llvm-svn: 310230	2017-08-06 21:42:25 +00:00
Tobias Grosser	b65ccc4302	[ScopInfo] Translate Scop::getParamSpace to isl++ [NFC] llvm-svn: 310224	2017-08-06 20:11:59 +00:00
Tobias Grosser	5ab39ff224	[ScopInfo] Move get*Writes/getReads/getAccesses to isl++ llvm-svn: 310219	2017-08-06 19:22:27 +00:00
Tobias Grosser	85048eff1a	[ScopInfo] Move ScopStmt::ScopStmt to isl++ [NFC] llvm-svn: 310210	2017-08-06 17:24:59 +00:00
Tobias Grosser	dcf8d696ff	Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++ llvm-svn: 310209	2017-08-06 16:39:52 +00:00
Tobias Grosser	feae3dfe9f	[unittests] Add unittest for getPartialTilePrefixes In https://reviews.llvm.org/D36278 it was pointed out that the behavior of getPartialTilePrefixes is not very well understood. To allow for a better understanding, we first provide some basic unittests. llvm-svn: 310175	2017-08-05 09:38:09 +00:00
Michael Kruse	138a3fbae1	[DeLICM] Refactor ZoneAlgorithm into ZoneAlgo.cpp. NFC. Extract ZoneAlgorithm from DeLICM.cpp into its own file. It will gain a second use by the load forwarding part of -polly-optree. llvm-svn: 310146	2017-08-04 22:51:23 +00:00
Michael Kruse	a9a7086319	[ForwardOpTree] Refactor out forwardSpeculatable(). NFC. The method forwardSpeculatable forwards speculatively executable instructions and is currently the only way to forward an instruction. In the future we intend to add more methods. llvm-svn: 310056	2017-08-04 12:28:42 +00:00
Tobias Grosser	7b45af13ce	Move setNewAccessRelation to isl++ llvm-svn: 309871	2017-08-02 19:27:25 +00:00
Philip Pfaffe	a70e2649ab	[Polly][PM][WIP] Polly pass registration Summary: This patch is a first attempt at registering Polly passes with the LLVM tools. Tool plugins are still unsupported, but this registration is usable from the tools if Polly is linked into them (albeit requiring minimal patches to those tools). Registration requires a small amount of machinery (the owning analysis proxies), necessary for injecting ScopAnalysisManager objects into the calling tools. This patch is marked WIP because the registration is incomplete. Parsing manual pipelines is fully supported, but default pass injection into the O3 pipeline is lacking, mostly because there is opportunity for some redesign here, I believe. The first point of order would be insertion points. I think it makes sense to run before the vectorizers. Running Polly Early, however, is weird. Mostly because it actually is the default (which to me is unexpected), and because Polly runs its own O1 pipeline. Why not instead insert it at an appropriate place somewhere after simplification happened? Running after the loop optimizers seems intuitive, but it also seems wasteful, since multiple consecutive loops might well be a single scop, and we don't need to run for all of them. My second request for comments would be regarding all those smallish helper passes we have, like PollyViewer, PollyPrinter, PollyImportJScop. Right now these are controlled by command line options, deciding whether they should be part of the Polly pipeline. What is your opinion on treating them like real passes, and having the user write an appropriate pipeline if they want to use any of them? Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35458 llvm-svn: 309826	2017-08-02 15:52:25 +00:00
Michael Kruse	fd35089689	[ForwardOpTree] Execute canForwardTree also in release builds. Commit r309730 moved the call to canForwardTree into an assert(), even though this function has side-effects if its DoIt parameter is true. To avoid a warning in release builds, do an (void)Execution of its result instead. To avoid such confusion in the future, rename canForwardTree() to forwardTree(). llvm-svn: 309753	2017-08-01 22:15:04 +00:00
Michael Kruse	bc88a78cb4	[Simplify] Rewrite redundant write detection algorithm. The previous algorithm was to search for a write and the source of its value operand, and see whether the write just stores the same read value back, which includes a search whether there is another write access between them. This is O(n^2) in the max number of accesses in a statement (+ the complexity of isl comparing the access functions). The new algorithm is more similar to the one used for searching for overwrites and coalescable writes. It scans over all accesses in order of execution while tracking which array elements still have the same value since it was read. This is O(n), not counting the complexity within isl. It should be more reliable than trying to catch all non-conforming cases in the previous approach. It is also less code. We now also support if the write is a partial write of the read's domain, and to some extent non-affine subregions. Differential Revision: https://reviews.llvm.org/D36137 llvm-svn: 309734	2017-08-01 20:01:34 +00:00
Reid Kleckner	859c1e606a	Silence -Wunused-variable warning in NDEBUG builds llvm-svn: 309730	2017-08-01 19:53:01 +00:00
Michael Kruse	693ef99935	[Simplify] Improve scalability. With a lot of reads and writes to the same array in a statement, some isl sets that capture the state between accesses can become so complex that isl takes considerable time and memory for operations on them. The problems identified were: - is_subset() takes considerable time with many disjuncts in the arguments. We limit the number of disjuncts to 4; any additional information is thrown away. - subtract() can lead to many disjuncts. We instead assume that any array element is possibly accessed, which removes all disjuncts. - subtract_domain() may lead to considerable processing, even if all elements are to be removed. Instead, we determine and remove the affected spaces manually. No behaviour is changed. llvm-svn: 309728	2017-08-01 19:39:11 +00:00
Michael Kruse	9f6e41cdba	[ForwardOpTree] Support synthesizable values. This allows -polly-optree to move instructions that depend on synthesizable values. The difficulty for synthesizable values is that their value depends on the location. When it is moved over a loop header, and the SCEV expression depends on the loop induction variable (SCEVAddRecExpr), it would use the current induction variable instead of the last one. At the moment we cannot forward PHI nodes, such that crossing the header of loops referenced by SCEVAddRecExpr is not possible (assuming the loop header has at least two incoming blocks: for entering the loop and the backedge, such that any instruction to be forwarded must have a phi between use and definition). A remaining issue is when the forwarded value is used after the loop, but is only synthesizable inside the loop. This happens e.g. if ScalarEvolution is unable to determine the number of loop iterations or the initial loop value. We do not forward in this situation. Differential Revision: https://reviews.llvm.org/D36102 llvm-svn: 309609	2017-07-31 19:46:21 +00:00
Michael Kruse	57cc92b790	[Simplify] Remove all kinds of redundant scalar writes. In addition to array and PHI writes, also allow scalar value writes. The only kind of write not allowed are writes by functions (including memcpy/memmove/memset). llvm-svn: 309582	2017-07-31 17:04:55 +00:00
Michael Kruse	ce9617f4fe	[Simplify] Implement write accesses coalescing. Write coalescing combines write accesses that - Write the same llvm::Value. - Write to the same array. - Unless they do not write anything in a statement instance (partial writes), write to the same element. - There is no other access between them that accesses the same element. This is particularly useful after DeLICM, which leaves partial writes to disjoint domains. Differential Revision: https://reviews.llvm.org/D36010 llvm-svn: 309489	2017-07-29 16:21:16 +00:00
Michael Kruse	6c8f91b908	[Simplify] Fix typo in statistics output. NFC. llvm-svn: 309402	2017-07-28 16:57:51 +00:00
Michael Kruse	34a77780c5	[Simplify] Remove empty partial accesses first. NFC. This way, follow-up cleanups do not need special handling for such accesses. llvm-svn: 309401	2017-07-28 16:57:45 +00:00
Michael Kruse	cedd7a74e1	[Simplify] Do not setInstructions() of region stmts. NFC. The instruction list is ignored for region statements, there is no reason to set it. llvm-svn: 309196	2017-07-26 22:01:28 +00:00
Roman Gareev	2e580538be	[ScheduleOptimizer] Translate to C++ bindings Translate the ScheduleOptimizer to use the new isl C++ bindings. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D35845 llvm-svn: 309119	2017-07-26 14:59:15 +00:00
Tobias Grosser	d7065e5df5	Move MemoryAccess::isStride* to isl++ llvm-svn: 308927	2017-07-24 20:50:22 +00:00
Michael Kruse	54071126d8	[ForwardOpTree] Properly indent enumeration in comment. NFC. llvm-svn: 308887	2017-07-24 15:34:03 +00:00
Michael Kruse	67752076bc	[ForwardOpTree] Rename FD_CanForward to FD_CanForwardLeaf. NFC. To make the meaning and distinction to FD_CanForwardTree clearer. llvm-svn: 308886	2017-07-24 15:33:58 +00:00
Michael Kruse	d85e345ce0	[ForwardOpTree] Add comments to ForwardingDecision items. NFC. In particular, explain the difference between FD_CanForward and FD_CanForwardTree. llvm-svn: 308885	2017-07-24 15:33:53 +00:00
Michael Kruse	07e8c36dc7	[ForwardOpTree] Support read-only value uses. Read-only values (values defined before the SCoP) require special handling with -polly-analyze-read-only-scalars=true (which is the default). If active, each use of a value requires a read access. When a copied value uses a read-only value, we must also ensure that such a MemoryAccess is available or is created. Differential Revision: https://reviews.llvm.org/D35764 llvm-svn: 308876	2017-07-24 12:43:27 +00:00
Michael Kruse	5b8a9095e8	[ForwardOpTree] Fix mixup in comment. NFC. The cases DoIt==false and DoIt==true were mixed up. Thanks to Siddharth for noticing. llvm-svn: 308874	2017-07-24 12:39:46 +00:00
Michael Kruse	25a688165b	[ScopInfo] Fix typo in method name. NFC. prependInstrunction -> prependInstruction Thanks Nandini for noticing. llvm-svn: 308873	2017-07-24 12:39:41 +00:00
Tobias Grosser	325812ac6d	Simplify: Adopt for translation of MemoryAccess::getAccessRelation For some reason this one was missed earlier. llvm-svn: 308845	2017-07-23 08:15:28 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Michael Kruse	ab8f0d57df	[Simplify] Remove partial write accesses with empty domain. If the access relation's domain is empty, the access will never be executed. We can just remove it. We only remove write accesses. Partial read accesses are not yet supported and instructions in the statement might require the llvm::Value holding the read's result to be defined. llvm-svn: 308830	2017-07-22 20:33:09 +00:00
Michael Kruse	e5f4706a55	[ForwardOpTree] Support hoisted invariant loads. Hoisted loads can be trivially supported because there are no MemoryAccess to be modified, the loaded value is just available at code generation. llvm-svn: 308826	2017-07-22 14:30:02 +00:00
Michael Kruse	a6b2de3b59	[ForwardOpTree] Introduce the -polly-optree pass. This pass 'forwards' operand trees into statements that use them in order to avoid scalar dependencies. This minimal implementation handles only the case of speculatable instructions. We will successively add support for: - Hoisted loads - Read-only values - Synthesizable values - Loads - PHIs - Forwarding only parts of the tree Differential Revision: https://reviews.llvm.org/D35754 llvm-svn: 308825	2017-07-22 14:02:47 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Michael Kruse	cd4c977b8b	[ScopInfo] Print instructions in dump(). Print a statement's instruction on dump() regardless of -polly-print-instructions. dump() is supposed to be used in the debugger only and never in regression tests. While debugging, get all the information we have and we are not bound to break anything. For non-dump purposes of print, forward the setting of -polly-print-instructions as parameters. Some calls to print() had to be changed because the PollyPrintInstructions setting is only available in ScopInfo.cpp. In ScheduleOptimizer.cpp, dump() was used in regression tests. That's not what dump() is for. The print parameter "PrintInstructions" will also be useful for an explicit print SCoP pass in a future patch. llvm-svn: 308746	2017-07-21 15:35:53 +00:00
Michael Kruse	22058c3fbb	[Simplify] Remove unused instructions and accesses. Use a mark-and-sweep algorithm to find and remove unused instructions and MemoryAccesses. This is useful in particular to remove scalar writes that are never used anywhere. A scalar write in a loop induces a write-after-write dependency that prevents the loop iterations from being rescheduled. Such writes can be a result of previous transformations such as DeLICM and operand tree forwarding. It adds a new class VirtualInstruction that represents an instruction in a particular statement. At the moment an instruction can only belong to the statement that represents a BasicBlock. In the future, instructions can be in one of multiple statements representing a BasicBlock (Nandini's work), in different statements than its BasicBlock would indicate, and even multiple statements at once (by forwarding operand trees). It also integrates nicely with the VirtualUse class. ScopStmt::contains(Instruction*) currently uses the instruction's parent BasicBlock to check whether it contains the instruction. It will need to check the actual statement list when one of the aforementioned features becomes possible. Differential Revision: https://reviews.llvm.org/D35656 llvm-svn: 308626	2017-07-20 16:21:55 +00:00
Michael Kruse	8b8058072f	[ScopInfo] Integrate ScalarDefUseChain into polly::Scop. NFC. Before this patch, ScalarDefUseChain was a tool used by DeLICM to find all reads and writes of scalar accesses. It iterated once over all accesses and stored the accesses into maps. By integrating it into the Scop class, we can keep the maps up-to-date without the need for recomputing them. It will be needed for more than DeLICM in the future, such as SCoP simplification, code movement between virtual statements, and array expansion (GSoC project). Compared to ScalarDefUseChain, we save two maps by finding the ScopStmt a Def/PHIRead must reside in, and use its already existing lookup function to find the MemoryAccess. Differential Revision: https://reviews.llvm.org/D35631 llvm-svn: 308495	2017-07-19 17:11:25 +00:00
Roman Gareev	750374181b	Make the pattern matching work with modified memory accesses Some optimizations (e.g., DeLICM) can modify memory accesses (e.g., change their MemoryKind). Consequently, the pattern matching should take this into account. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D33138 llvm-svn: 308494	2017-07-19 16:59:06 +00:00
Michael Kruse	629f9185bf	[Simplify] Ensure all counters are reset before next SCoP is processed. NFC. llvm-svn: 308473	2017-07-19 14:07:21 +00:00
Tobias Grosser	303bd07c6e	[ScopInfo] Introduce tryGetValueStored Summary: This makes code more readable and allows reusing this functionality in other places in the future. Suggested by Michael Kruse in post-commit review of r307660. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35585 llvm-svn: 308435	2017-07-19 11:09:16 +00:00
Tobias Grosser	8e1280b8b2	[Polly] Fix a typo [NFC] Reviewers: grosser, Meinersbur, bollu Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35459 llvm-svn: 308134	2017-07-16 13:54:41 +00:00
Tobias Grosser	bed2ca6eac	[Simplify] Also remove redundant writes which originally came from PHI nodes llvm-svn: 307660	2017-07-11 14:29:39 +00:00
Singapuram Sanjay Srivallabh	02ca346e48	Introduce a hybrid target to generate code for either the GPU or CPU Summary: Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU. When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration. In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: kbarton, nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D34054 llvm-svn: 306863	2017-06-30 19:42:21 +00:00
Tobias Grosser	dcd94e3e93	[ScheduleOptimizer] Fix minor typo [NFC] llvm-svn: 305709	2017-06-19 16:55:48 +00:00
Tobias Grosser	2fb3ed200a	[ScheduleOptimizer] Move isolateFullPartialTiles and isolateAndUnrollMatMulInnerLoops to C++ llvm-svn: 305676	2017-06-19 10:40:12 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Michael Kruse	ad7a1805be	[Simplify] Use execution order of memory accesses. Iterate through memory accesses in execution order (first all implicit reads, then explicit accesses, then implicit writes). In the test case this caused an implicit load to be handled as if it was loaded after the write. That is, the value being written before it is available. This fixes llvm.org/PR33323 llvm-svn: 304810	2017-06-06 17:46:42 +00:00
Eli Friedman	de1b318dad	Add opt-bisect support to polly. This is useful for debugging miscompiles and extracting testcases for crashes. See http://llvm.org/docs/OptBisect.html . Differential Revision: https://reviews.llvm.org/D33752 llvm-svn: 304480	2017-06-01 21:29:05 +00:00
Michael Kruse	5f16986271	[DeLICM] Partial writes for PHIs. Enable the use of partial writes for PHI write accesses with a switch. This simply skips the test for whether a PHI write would be partial. The analogous test for partial value writes also protects against partial reads, which we do not support (yet). It is possible to test for partial reads separately such that we could skip the partial write check as well. In case this turns out to be useful, I can implement it as well. Differential Revision: https://reviews.llvm.org/D33487 llvm-svn: 303762	2017-05-24 15:23:06 +00:00
Tobias Grosser	7205f93a98	[ScheduleOptimizer] Move schedule construction to isl C++ [NFC] llvm-svn: 303508	2017-05-21 16:21:33 +00:00
Tobias Grosser	b5f61bdeeb	[Simplify] Move to isl C++ llvm-svn: 303507	2017-05-21 16:12:21 +00:00
Tobias Grosser	443f6814a1	[isl++] Rebase isl C++ bindings on top of 29aee98ce This reduces the diff to the official isl C++ bindings and solves a correctness issue with isl::booleans, where isl_bool_error results were accidentally converted to isl::boolean::true. llvm-svn: 303505	2017-05-21 15:59:15 +00:00
Tobias Grosser	3320485961	[isl++] Move isl raw_ostream printers into separate header Instead of relying on these functions to be part of the isl C++ bindings, we just define this functionality independently. This allows us to use isl C++ bindings that do not contain LLVM specific functionality. llvm-svn: 303503	2017-05-21 13:16:05 +00:00
Siddharth Bhat	9746f817ea	[Simplify] Fix r302986 that introduced non-inferrable templates. - auto + decltype + template use was not inferrable in `Transform/Simplify.cpp accessesInOrder`. - changed code to explicitly construct required vector instead of using higher order iterator helpers. - Failing compiler spec: Apple LLVM version 7.3.0 (clang-703.0.31) Target: x86_64-apple-darwin15.6.0 llvm-svn: 303039	2017-05-15 08:18:51 +00:00
Tobias Grosser	497fdd7dff	[Simplify] Remove some leftover dead code llvm-svn: 303007	2017-05-14 09:20:56 +00:00
Michael Kruse	fa7be88378	[Simplify] Remove identical write removal. NFC. Removal of overwritten writes currently encompasses all the cases of the identical write removal. There is an observable behavioral change in that the last, instead of the first, MemoryAccess is kept. This should not affect the generated code, however. Differential Revision: https://reviews.llvm.org/D33143 llvm-svn: 302987	2017-05-13 12:20:57 +00:00
Michael Kruse	f263610b82	[Simplify] Remove writes that are overwritten. Remove memory writes that are overwritten by later writes. This works for StoreInsts (e.g. `store double 21.0, double* %A` followed by `store double 42.0, double* %A`), scalar writes at the end of a statement, and mixes of these. Multiple writes can be the result of DeLICM, which might map multiple writes to the same location when it knows that these do not conflict (for instance because they write the same value). Such writes interfere with pattern-matched optimization such as gemm and may not get removed by other LLVM passes after code generation. Differential Revision: https://reviews.llvm.org/D33142 llvm-svn: 302986	2017-05-13 11:49:34 +00:00
Michael Kruse	aeb4864090	[Simplify] Reset all stats between runs. llvm-svn: 302926	2017-05-12 17:23:07 +00:00
Michael Kruse	d644ec7647	[DeLICM] Use input access heuristic for mapped PHI WRITEs. As with the scalar operand of the initial StoreInst, also use input accesses when searching for new opportunities after mapping a PHI write. The same rationale applies here: After LICM has been applied, the promoted value will either be an instruction in the same statement (in which case we fall back to try every scalar access of the statement), or in another statement such that there will be such an input access. In the latter case other scalars cannot have originated from the same register promotion, at least not by LICM. This mostly helps to decrease compilation time and makes debugging easier by not pursuing unpromising routes. In some circumstances, it may change the compiler's output. llvm-svn: 302839	2017-05-11 22:56:59 +00:00
Michael Kruse	4c27643398	[DeLICM] Lookup input accesses. Previous to this patch, we used VirtualUse to determine the input access of an llvm::Value in a statement. The input access is the READ MemoryAccess that makes a value available in that statement, which can either be a READ of a MemoryKind::Value or the MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input access to heuristically find a candidate to map without searching all possible values. This might modify the behaviour in that PHI accesses were previously not considered input accesses. This was unintentionally lost when "VirtualUse" was extracted from the "Known Knowledge" patch. llvm-svn: 302838	2017-05-11 22:56:46 +00:00
Michael Kruse	07e315e780	[Simplify] Remove identical scalar writes. After DeLICM, it is possible to have two writes of the same value to the same location in the same statement when it determined that those writes do not conflict (write the same value). Teach -polly-simplify to remove one of the writes. It interferes with the pattern matching of matrix-multiplication kernels and also seems not to be optimized away by LLVM. The algorithm is simple, has O(n^2) behaviour (n = max number of MemoryAccesses in a statement) and only matches the most obvious cases, but seems to be enough to pattern-match Boost ublas gemm. Not handled cases include: - StoreInst instructions (a.k.a. explicit writes), since the value might be loaded or overwritten between the two stores. - PHINode, especially LCSSA, when the PHI value matches with another's. - Partial writes (in preparation) llvm-svn: 302805	2017-05-11 15:07:38 +00:00
Michael Kruse	a0987b83d5	[Simplify] Mark variables as used. NFC. Mark one more variable as used that is needed in assertions. llvm-svn: 302726	2017-05-10 20:45:10 +00:00
Michael Kruse	4aac59cee1	[Simplify] Mark variables as used. NFC. Mark variables as used that are needed in assertions. llvm-svn: 302725	2017-05-10 20:42:02 +00:00
Michael Kruse	f41f274bf8	[DeLICM] Avoid compiler warning. NFC. gcc 5.4 warns about using a C-style cast to cast away a const. Use a const_cast instead. llvm-svn: 302715	2017-05-10 19:58:52 +00:00
Michael Kruse	f69a7c306b	[DeLICM] Always normalize domain. NFC. Some isl functions can simplify their __isl_keep arguments. The argument object after the call uses different constraints to represent the same set. Different constraints can result in different outputs when printed to a string. In assert builds, additional isl functions are called (inside assert()); these can change the internal representation of their read-only arguments such that printed strings are different in debug and non-debug builds. What happened here is that a call to isl_set_is_equal inside an assert in getScatterFor normalizes one of its arguments such that one redundant constraint is removed. The redundant constraint therefore does not appear in the string representing the domain, which FileCheck notices as a regression test failure compared to a build with assertions disabled. This fix removes the redundant constraints from the domain from the start such that the redundant constraint is removed in assert and non-assert builds. Isl adds a flag to such sets such that the removal of redundancies is not done multiple times (here: by isl_set_is_equal). Thanks to Tobias Grosser for reporting and hinting at the cause. llvm-svn: 302711	2017-05-10 19:50:45 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We already removed all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct from the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Michael Kruse	5ae08c0ebb	[DeLICM] Known knowledge. Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict if the known content is the same. The content information is computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339	2017-05-06 14:03:58 +00:00
Michael Kruse	3e519b949b	[DeLICM] Use Known information when comparing Occupied and Written. Do not conflict if a write writes the same value as already known. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32026 llvm-svn: 301460	2017-04-26 20:35:07 +00:00
Michael Kruse	cd2be66bf0	[DeLICM] Use Known information when comparing Existing.Occupied and Proposed.Occupied. Do not conflict if the value of Existing and Proposed are the same. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32025 llvm-svn: 301301	2017-04-25 10:57:32 +00:00
Michael Kruse	8431e996d3	[DeLICM] Use Known information when comparing Existing.Written and Proposed.Written. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32027 llvm-svn: 300874	2017-04-20 19:16:39 +00:00
Tobias Grosser	1f8b84094f	Update isl bindings to latest version (+ Polly extensions) After the isl C++ binding generator is now close to being upstreamed to isl, we synchronize the latest changes to Polly. These are mostly formatting changes plus a small interface change for the foreach callback function and some naming changes in isl::boolean. llvm-svn: 300398	2017-04-15 08:15:54 +00:00
Tobias Grosser	75aa1a9a49	Use isl C++ foreach implementation This commit switches Polly over to the isl::obj::foreach_* implementation, which is part of the new isl bindings and follows the foreach pattern established in Polly by Michael Kruse. The original isl C function: isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset, isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user); which required the user to define a static callback function to which all interesting parameters are passed via a 'void *' user-pointer, is on the C++ side available as a function that takes a std::function<>, which can carry any additional arguments without the need for a user pointer: stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const; The following code illustrates the use of the new C++ interface: auto Lambda = [=, &Result](isl::set Set) -> isl::stat { auto Shifted = shiftDimension(Set, Pos, Amount); Result = Result.add(Shifted); return isl::stat::ok; }; UnionSet.foreach_set(Lambda); Polly had some specialized foreach functions which did not require the lambdas to return a status flag. We remove these functions in this commit to move Polly completely over to the new isl interface. We may in the future discuss if functors without return values can be supported easily. Another extension proposed by Michael Kruse is the use of C++ iterators to allow the use of normal for loops to iterate over these sets. Such an extension would allow us to further simplify the code. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D30620 llvm-svn: 300323	2017-04-14 13:39:40 +00:00
Michael Kruse	72f3922534	[DeLICM] Export Known and Written to DeLICMTests. NFC. This will allow unittesting of new functionality based on Known and Written. llvm-svn: 300211	2017-04-13 16:32:39 +00:00
Michael Kruse	a2acc11949	[DeLICM] Add Knowledge::Known. NFC. This field will later contain a ValInst that is known to be stored in an occupied array element. llvm-svn: 300210	2017-04-13 16:32:31 +00:00
Michael Kruse	fa7c8cdfc6	[DeLICM] Make Knowledge::Written an isl::union_map. NFC. The map will later point to a ValInst that is written. llvm-svn: 300208	2017-04-13 16:32:25 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Roman Gareev	9d4d91ca6a	[FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern Use new values of the dimensions during their permutation. llvm-svn: 299663	2017-04-06 17:25:08 +00:00
Roman Gareev	e0d466342b	Restore the initial ordering of dimensions before applying the pattern matching Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree: domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce a miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662	2017-04-06 17:09:54 +00:00
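
The loop nest quoted in commit e0d466342b iterates in i, k, j order, and the schedule tree marks the band as permutable. The following is a minimal standalone sketch, not Polly code, of what that permutability means: the same statement run in i, k, j order and in i, j, k order yields the same C. The names `Matrix`, `makeMatrix`, `matMulIKJ` and `matMulIJK` are hypothetical helpers, and the size is reduced from 1024 to 8 to keep the example small.

```cpp
#include <array>
#include <cstddef>

// Hypothetical small size; the commit message uses 1024.
constexpr std::size_t N = 8;

using Matrix = std::array<std::array<long long, N>, N>;

// Fill a matrix with deterministic values: M[I][J] = Seed + I * N + J.
Matrix makeMatrix(long long Seed) {
  Matrix M{};
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t J = 0; J < N; ++J)
      M[I][J] = Seed + static_cast<long long>(I * N + J);
  return M;
}

// Loop order as written in the commit message: i, k, j.
Matrix matMulIKJ(const Matrix &A, const Matrix &B) {
  Matrix C{}; // zero-initialized accumulator
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t K = 0; K < N; ++K)
      for (std::size_t J = 0; J < N; ++J)
        C[I][J] += A[I][K] * B[K][J];
  return C;
}

// Same statement with the two inner dimensions permuted: i, j, k.
Matrix matMulIJK(const Matrix &A, const Matrix &B) {
  Matrix C{}; // zero-initialized accumulator
  for (std::size_t I = 0; I < N; ++I)
    for (std::size_t J = 0; J < N; ++J)
      for (std::size_t K = 0; K < N; ++K)
        C[I][J] += A[I][K] * B[K][J];
  return C;
}
```

Calling `matMulIKJ(makeMatrix(1), makeMatrix(2))` and `matMulIJK(makeMatrix(1), makeMatrix(2))` and comparing the results with `==` shows that both orders agree. This only illustrates why a permuted band is still a valid schedule; it does not model how the pattern matcher depends on a particular dimension order.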

329 Commits