llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	27db02b247	[ScopInfo] Move ScopArrayInfo::ScopArrayInfo to isl++ [NFC] llvm-svn: 310211	2017-08-06 17:25:05 +00:00
Tobias Grosser	85048eff1a	[ScopInfo] Move ScopStmt::ScopStmt to isl++ [NFC] llvm-svn: 310210	2017-08-06 17:24:59 +00:00
Tobias Grosser	dcf8d696ff	Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++ llvm-svn: 310209	2017-08-06 16:39:52 +00:00
Tobias Grosser	a9b5bbac78	Move ScopStmt::Domain to isl++ llvm-svn: 310207	2017-08-06 16:11:53 +00:00
Tobias Grosser	cb0224ad59	Update to a newer version of isl++ llvm-svn: 310206	2017-08-06 15:56:45 +00:00
Tobias Grosser	8b40f8c6c7	Update to isl-0.18-812-g565da6e This update is mostly a maintenance update, but also exposes a couple of new functions that will be needed for the next version of the isl++ bindings. llvm-svn: 310205	2017-08-06 15:51:16 +00:00
Tobias Grosser	bfee458d0f	[Scopinfo] Fix memory corruption issue that sneaked into the previous commit llvm-svn: 310204	2017-08-06 15:47:04 +00:00
Tobias Grosser	2332fa3604	[ScopInfo] Move InvalidDomain to isl++ [NFC] llvm-svn: 310203	2017-08-06 15:36:48 +00:00
Tobias Grosser	2b7479b1af	[Polly] Fix for the JSON Exporter Summary: Small patch to fix the JSON exporter. Currently, using "opt -polly-export-jscop" does not generate jscop files, but gives an error: * Error in `opt': corrupted double-linked list: 0x0000000000bc4bb0 * Updated the function getAccessRelationStr() to work with the current version of getAccessRelation(), fixing the JSON exporter Reviewers: bollu, grosser Reviewed By: grosser Subscribers: grosser, llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D36370 llvm-svn: 310199	2017-08-06 11:41:10 +00:00
Tobias Grosser	b99c11710c	[GPGPU] Make sure managed arrays are prepared at the beginning of the scop Summary: This resolves some "instruction does not dominate use" errors, as we used to prepare the arrays at the location of the first kernel, which not necessarily dominated all other kernel calls. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D36372 llvm-svn: 310196	2017-08-06 11:10:38 +00:00
Tobias Grosser	5b307cdb8a	[GPGPU] Rename all, not only the first libdevice function llvm-svn: 310194	2017-08-06 03:04:15 +00:00
Siddharth Bhat	e53c924b0f	[Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in PPCGCodeGeneration. A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder`. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193	2017-08-06 02:39:05 +00:00
Siddharth Bhat	0caed1fbe6	[IslNodeBuilder] [NFC] Refactor creation of loop induction variables of loops outside scops. This logic is duplicated, so we refactor it into a separate function. This will be used in a later patch to teach PPCGCodeGen code generation for loops that are outside the scop. Differential Revision: https://reviews.llvm.org/D36310 llvm-svn: 310192	2017-08-06 02:07:11 +00:00
Tobias Grosser	feae3dfe9f	[unittests] Add unittest for getPartialTilePrefixes In https://reviews.llvm.org/D36278 it was pointed out that the behavior of getPartialTilePrefixes is not very well understood. To allow for a better understanding, we first provide some basic unittests. llvm-svn: 310175	2017-08-05 09:38:09 +00:00
Michael Kruse	138a3fbae1	[DeLICM] Refactor ZoneAlgorithm into ZoneAlgo.cpp. NFC. Extract ZoneAlgorithm from DeLICM.cpp into its own file. It will gain a second use by the load forwarding part of -polly-optree. llvm-svn: 310146	2017-08-04 22:51:23 +00:00
Siddharth Bhat	638316da5b	[PPCGCodeGeneration] [NFC] Log every location from which PPCGCodegen bails. This is useful when trying to understand why no GPU code was produced. Differential Revision: https://reviews.llvm.org/D36318 llvm-svn: 310103	2017-08-04 19:36:40 +00:00
Michael Kruse	a9a7086319	[ForwardOpTree] Refactor out forwardSpeculatable(). NFC. The method forwardSpeculatable forwards speculatively executable instructions and is currently the only way to forward an instruction. In the future we intend to add more methods. llvm-svn: 310056	2017-08-04 12:28:42 +00:00
Philip Pfaffe	96d2143f20	[PM] Make the new-pm passes behave more like the legacy passes Summary: Testing the new-pm passes becomes much easier once they behave more like the old passes in terms of the order in which Scops are processed and printed. This requires three changes: - ScopInfo: Use an ordered map to store scops - ScopInfo: Iterate and print Scops in reverse order to match legacy PM behaviour - ScopDetection: print function name in ScopAnalysisPrinter Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36303 llvm-svn: 310052	2017-08-04 11:28:51 +00:00
Michael Kruse	1046aa3148	[VirtualInstruction] Handle MetadataAsValue as constant. The compilation of bspatch.cc of the AOSP buildbot currently fails, presumably because of the occurrence of a MetadataAsValue in an operand. This kind of value can occur as operands of intrinsics, the typical example being the debug intrinsics. Polly currently ignores the debug intrinsics and it is not yet clear which other intrinsics might occur. For such cases, and to unbreak the AOSP buildbot, treat a MetadataAsValue as a constant because it can be referenced without modification in generated code. llvm-svn: 309992	2017-08-03 22:00:01 +00:00
Michael Kruse	672c011460	[VirtualInstruction] Avoid use of getStmtFor(BB). NFC. With this patch, we get rid of the last use of getStmtFor(BB). Here this is done by getting the last statement of the incoming block in case the user is a phi node; otherwise just fetching the statement comprising the instruction for which the virtual use is being created. Differential Revision: https://reviews.llvm.org/D36268 llvm-svn: 309947	2017-08-03 15:27:00 +00:00
Tobias Grosser	b5563c6817	Make sure that all parameter dimensions are set in schedule Summary: In case the option -polly-ignore-parameter-bounds is set, not all parameters will be added to context and domains. This is useful to keep the size of the sets and maps we work with small. Unfortunately, for AST generation it is necessary to ensure all parameters are part of the schedule tree. Hence, we modify the GPGPU code generation to make sure this is the case. To obtain the necessary information we expose a new function Scop::getFullParamSpace(). We also make a couple of functions const to be able to make SCoP::getFullParamSpace() const. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36243 llvm-svn: 309939	2017-08-03 13:51:15 +00:00
Siddharth Bhat	eadf76d34a	[PPCGCodeGeneration] Construct `isl_multi_pw_aff` of PPCGArray.bounds even when polly-ignore-parameter-bounds is turned on. When we have `-polly-ignore-parameter-bounds`, `Scop::Context` does not contain all the parameters present in the program. The construction of the `isl_multi_pw_aff` requires all the individual `pw_aff` to have the same parameter dimensions. To achieve this, we used to realign every `pw_aff` with `Scop::Context`. However, in conjunction with `-polly-ignore-parameter-bounds`, this is now incorrect, since `Scop::Context` does not contain all parameters. We set this up correctly by creating a space that has all the parameters used by all the `isl_pw_aff`. Then, we realign all `isl_pw_aff` to this space. llvm-svn: 309934	2017-08-03 12:09:33 +00:00
Tobias Grosser	a195576118	Enable simplify and forward-op-tree by default These passes have been tested over the last month and should generally help to remove scalar data dependences in Polly. We enable them to give them even wider test coverage. Large performance regressions and any kind of correctness regressions are not expected. llvm-svn: 309878	2017-08-02 20:12:27 +00:00
Tobias Grosser	7b45af13ce	Move setNewAccessRelation to isl++ llvm-svn: 309871	2017-08-02 19:27:25 +00:00
Tobias Grosser	6d58804cc2	Move ScopStmt::setAccessRelation to isl++ llvm-svn: 309870	2017-08-02 19:27:16 +00:00
Tobias Grosser	18ca9e5119	Replace asserts with llvm_unreachable to clarify intent llvm-svn: 309856	2017-08-02 19:11:46 +00:00
Philip Pfaffe	33aef072c1	Fix r309826: Appease clang-format check. llvm-svn: 309853	2017-08-02 18:26:48 +00:00
Singapuram Sanjay Srivallabh	1f9ab16c4e	Fix code format on r309826 Summary: Fix code format on r309826 / D35458 Reviewers: grosser, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D36232 llvm-svn: 309845	2017-08-02 17:56:39 +00:00
Philip Pfaffe	8f1872fb27	Fix r309826: Move instantiation and specialization of OwningScopAnalysisManagerFunctionProxy to the polly namespace. When compiling with clang, explicit instantiation of the OwningScopAnalysisManagerFunctionProxy needs to happen within the polly namespace. Same goes with the specialization of its run method. llvm-svn: 309835	2017-08-02 17:25:45 +00:00
Philip Pfaffe	a70e2649ab	[Polly][PM][WIP] Polly pass registration Summary: This patch is a first attempt at registering Polly passes with the LLVM tools. Tool plugins are still unsupported, but this registration is usable from the tools if Polly is linked into them (albeit requiring minimal patches to those tools). Registration requires a small amount of machinery (the owning analysis proxies), necessary for injecting ScopAnalysisManager objects into the calling tools. This patch is marked WIP because the registration is incomplete. Parsing manual pipelines is fully supported, but default pass injection into the O3 pipeline is lacking, mostly because there is opportunity for some redesign here, I believe. The first point of order would be insertion points. I think it makes sense to run before the vectorizers. Running Polly Early, however, is weird. Mostly because it actually is the default (which to me is unexpected), and because Polly runs its own O1 pipeline. Why not instead insert it at an appropriate place somewhere after simplification happened? Running after the loop optimizers seems intuitive, but it also seems wasteful, since multiple consecutive loops might well be a single scop, and we don't need to run for all of them. My second request for comments would be regarding all those smallish helper passes we have, like PollyViewer, PollyPrinter, PollyImportJScop. Right now these are controlled by command line options, deciding whether they should be part of the Polly pipeline. What is your opinion on treating them like real passes, and have the user write an appropriate pipeline if they want to use any of them? Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35458 llvm-svn: 309826	2017-08-02 15:52:25 +00:00
Singapuram Sanjay Srivallabh	188053af5e	Remove debug metadata from copied instruction to prevent GPUModule verification failure Summary: Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string codegeneration. When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string. The only debug metadata of the instruction, the DebugLoc, is unset by this patch. This patch reattempts https://reviews.llvm.org/D35630 by targeting only those instructions that are to end up in a Module meant for the GPU. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D36161 llvm-svn: 309822	2017-08-02 15:20:07 +00:00
Philip Pfaffe	f081ec7609	[PM] Fix proxy invalidation Summary: I made a mistake in handling transitive invalidation of analysis results. I've updated the list of preserved analyses as well as the correct result dependences. The Invalidator passed through the invalidate() path can be used to transitively invalidate analyses. It frequently happens that analysis results depend on other analyses, and thus store references to their results. When the dependee now gets invalidated, the depender needs to be invalidated as well. This is the purpose of the Invalidator object, which can be used to check whether some dependee analysis is in the process of being invalidated. I originally was checking the wrong dependee analyses, which is an actual error, you can only check analysis results that are in the cache (which they are if you've captured their reference). The invalidation I'm handling inside the proxy deals with the standard analyses the proxy passes into the Scop pipeline, since I'm capturing their reference. This checking allows us to actually preserve a couple of results outside of the proxy, since the Scop pipeline shouldn't break those, or otherwise should update them accordingly. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36216 llvm-svn: 309811	2017-08-02 13:18:49 +00:00
Philip Pfaffe	ead67dbbd6	[SI][NewPM] Collect loop count statistics llvm-svn: 309807	2017-08-02 11:14:41 +00:00
Philip Pfaffe	f5a4394ad6	[SD] Set PollyUseRuntimeAliasChecks correctly llvm-svn: 309805	2017-08-02 11:08:01 +00:00
Michael Kruse	fd35089689	[ForwardOpTree] Execute canForwardTree also in release builds. Commit r309730 moved the call to canForwardTree into an assert(), even though this function has side-effects if its DoIt parameter is true. To avoid a warning in release builds, do an (void)Execution of its result instead. To avoid such confusion in the future, rename canForwardTree() to forwardTree(). llvm-svn: 309753	2017-08-01 22:15:04 +00:00
Michael Kruse	bc88a78cb4	[Simplify] Rewrite redundant write detection algorithm. The previous algorithm was to search for a write and the source of its value operand, and see whether the write just stores the same read value back, which includes a search whether there is another write access between them. This is O(n^2) in the max number of accesses in a statement (+ the complexity of isl comparing the access functions). The new algorithm is more similar to the one used for searching for overwrites and coalescable writes. It scans over all accesses in order of execution while tracking which array elements still have the same value since it was read. This is O(n), not counting the complexity within isl. It should be more reliable than trying to catch all non-conforming cases in the previous approach. It is also less code. We now also support if the write is a partial write of the read's domain, and to some extent non-affine subregions. Differential Revision: https://reviews.llvm.org/D36137 llvm-svn: 309734	2017-08-01 20:01:34 +00:00
Reid Kleckner	859c1e606a	Silence -Wunused-variable warning in NDEBUG builds llvm-svn: 309730	2017-08-01 19:53:01 +00:00
Michael Kruse	693ef99935	[Simplify] Improve scalability. With a lot of reads and writes to the same array in a statement, some isl sets that capture the state between accesses can become complex such that isl takes considerably more time and memory for operations on them. The problems identified were: - is_subset() takes considerable time with many disjoints in the arguments. We limit the number of disjoints to 4, any additional information is thrown away. - subtract() can lead to many disjoints. We instead assume that any array element is possibly accessed, which removes all disjoints. - subtract_domain() may lead to considerable processing, even if all elements are to be removed. Instead, we determine and remove the affected spaces manually. No behaviour is changed. llvm-svn: 309728	2017-08-01 19:39:11 +00:00
Tobias Grosser	e327eebccb	Update to isl-0.18-809-gd5b4535 This fixes some undefined behavior in the isl schedule tree code. llvm-svn: 309727	2017-08-01 19:37:50 +00:00
Siddharth Bhat	edf9581e4c	[PPCGCodeGeneration] Correct usage of llvm::Value with getLatestValue. It is possible that the `HostPtr` that corresponds to an array could be invariant load hoisted. Make sure we use the invariant load hoisted value by using `IslNodeBuilder::getLatestValue`. Differential Revision: https://reviews.llvm.org/D36001 llvm-svn: 309681	2017-08-01 14:26:39 +00:00
Siddharth Bhat	f2cfd2a4db	[NFC] [IslNodeBuilder, GPUNodeBuilder] Unify mechanism for looking up replacement Values. We populate `IslNodeBuilder::ValueMap` which contains replacements for `llvm::Value`s. There was no simple method to pick up a replacement if it exists, otherwise fall back to the original. Create a method `IslNodeBuilder::getLatestValue` which provides this functionality. This will be used in a later patch to fix bugs in `PPCGCodeGeneration` where the latest value is not being used. Differential Revision: https://reviews.llvm.org/D36000 llvm-svn: 309674	2017-08-01 12:15:51 +00:00
Siddharth Bhat	4d5820d171	[NFC] [PPCGCodeGeneration] Convert GPUNodeBuilder::getGridSizes to isl++. llvm-svn: 309671	2017-08-01 10:45:41 +00:00
Siddharth Bhat	ccbf4b509c	[NFC] [PPCGCodeGeneration] Convert GPUNodeBuilder::getArrayOffset to isl++. llvm-svn: 309669	2017-08-01 09:58:55 +00:00
Michael Kruse	9f6e41cdba	[ForwardOpTree] Support synthesizable values. This allows -polly-optree to move instructions that depend on synthesizable values. The difficulty for synthesizable values is that their value depends on the location. When it is moved over a loop header, and the SCEV expression depends on the loop induction variable (SCEVAddRecExpr), it would use the current induction variable instead of the last one. At the moment we cannot forward PHI nodes such that crossing the header of loops referenced by SCEVAddRecExpr is not possible (assuming the loop header has at least two incoming blocks: for entering the loop and the backedge, such that any instruction to be forwarded must have a phi between use and definition). A remaining issue is when the forwarded value is used after the loop, but is only synthesizable inside the loop. This happens e.g. if ScalarEvolution is unable to determine the number of loop iterations or the initial loop value. We do not forward in this situation. Differential Revision: https://reviews.llvm.org/D36102 llvm-svn: 309609	2017-07-31 19:46:21 +00:00
Michael Kruse	57cc92b790	[Simplify] Remove all kinds of redundant scalar writes. In addition to array and PHI writes, also allow scalar value writes. The only kind of write not allowed are writes by functions (including memcpy/memmove/memset). llvm-svn: 309582	2017-07-31 17:04:55 +00:00
Tobias Grosser	8fc6cdfb1c	[GPGPU] Add support for NVIDIA libdevice Summary: This allows us to map functions such as exp, expf, expl, for which no LLVM intrinsics exist. Instead, we link to NVIDIA's libdevice which provides high-performance implementations of a wide range of (math) functions. We currently link only a small subset, the exp, cos and copysign functions. Other functions will be enabled as needed. Reviewers: bollu, singam-sanjay Reviewed By: bollu Subscribers: tstellar, tra, nemanjai, pollydev, mgorny, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35703 llvm-svn: 309560	2017-07-31 14:03:16 +00:00
Tobias Grosser	39977e4e76	Revert "Remove Debug metadata from copied instruction to prevent Module verification failure" This reverts commit r309490 as it triggers on our AOSP buildbot error messages of the form: inlinable function call in a function with debug info must have a !dbg location llvm-svn: 309556	2017-07-31 11:43:38 +00:00
Tobias Grosser	7639db8ed9	[IslNodeBuilder] Remove unused instruction Suggested-by: Maximilian Falkenstein <falkensm@student.ethz.ch> llvm-svn: 309533	2017-07-31 01:59:23 +00:00
Singapuram Sanjay Srivallabh	cf9a813368	Remove Debug metadata from copied instruction to prevent Module verification failure Summary: Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string codegeneration. When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string. The only debug metadata of the instruction, the DebugLoc, is unset by this patch. Reviewers: grosser, bollu, Meinersbur Reviewed By: grosser, bollu Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35630 llvm-svn: 309490	2017-07-29 18:03:49 +00:00
Michael Kruse	ce9617f4fe	[Simplify] Implement write accesses coalescing. Write coalescing combines write accesses that - Write the same llvm::Value. - Write to the same array. - Unless they do not write anything in a statement instance (partial writes), write to the same element. - There is no other access between them that accesses the same element. This is particularly useful after DeLICM, which leaves partial writes to disjoint domains. Differential Revision: https://reviews.llvm.org/D36010 llvm-svn: 309489	2017-07-29 16:21:16 +00:00
Michael Kruse	8e41d2baab	[Simplify] Do not remove dependencies of phis within region stmts. These were wrongly assumed to be phi nodes that require MemoryKind::PHI accesses. llvm-svn: 309454	2017-07-28 23:22:32 +00:00
Michael Kruse	fd7f40961b	[VirtualInstruction] Do not iterate over a region statement's instruction list. NFC. It should be empty anyways. In this case it would even be redundant because we just add all instructions in region statements. llvm-svn: 309453	2017-07-28 23:22:23 +00:00
Michael Kruse	6c8f91b908	[Simplify] Fix typo in statistics output. NFC. llvm-svn: 309402	2017-07-28 16:57:51 +00:00
Michael Kruse	34a77780c5	[Simplify] Remove empty partial accesses first. NFC. So follow-up cleanups do not need special handling for such accesses. llvm-svn: 309401	2017-07-28 16:57:45 +00:00
Siddharth Bhat	4ebeb3568a	[PPCGCodeGeneration] Check that invariant load hoisting succeeded. If we fail, throw an error for now. We can gracefully handle this later. llvm-svn: 309387	2017-07-28 14:48:32 +00:00
Siddharth Bhat	0a1177b58e	[ScopDetect] add `-polly-ignore-func` flag to ignore functions by name. Ignore all functions whose name match a regex. Useful because creating a regex that does not match a string is somewhat hard. Example: https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex llvm-svn: 309377	2017-07-28 11:47:24 +00:00
Tobias Grosser	25271b91b2	[GPGPU] Do not require the Scop::Context to have information about all parameters llvm-svn: 309368	2017-07-28 06:49:44 +00:00
Tobias Grosser	30caae6d23	[GPGPU] Fix compilation issue with latest CUDA upgrade to i128 llvm-svn: 309366	2017-07-28 06:38:49 +00:00
Tobias Grosser	adcbee5433	Update isl to isl-0.18-800-g4018f45 This fixes a bug in isl_flow where triggering the compute out could result in undefined or unexpected behavior. This fixes some recent regressions we saw in the android buildbots. Thanks Eli Friedman for reducing the corresponding test cases. llvm-svn: 309274	2017-07-27 14:48:02 +00:00
Michael Kruse	a508a4e619	[ScopBuilder/Simplify] Refactor isEscaping. NFC. ScopBuilder and Simplify (through VirtualInstruction.cpp) previously used this functionality in their own implementation. Refactor them both into a common one into the Scop class. BlockGenerator also makes use of a similar functionality, but also records outside users and takes place after region simplification. Merging it as well would be more complicated. llvm-svn: 309273	2017-07-27 14:39:52 +00:00
Michael Kruse	8a8aca4299	[Simplify] Count PHINodes in simplifiable exit nodes as escaping use. After region exit simplification, the incoming block of a phi node in the SCoP region's exit block lands outside of the region. Since we treat SCoPs as if this already happened, we need to account for that when looking for outside uses of scalars (i.e. escaping scalars). llvm-svn: 309271	2017-07-27 14:09:31 +00:00
Michael Kruse	eca86cee64	[ScopInfo] Never print instruction list of region stmts. A region statement's instruction list is always empty and ignored by the code generator. Don't give the impression that it means anything. llvm-svn: 309197	2017-07-26 22:01:33 +00:00
Michael Kruse	cedd7a74e1	[Simplify] Do not setInstructions() of region stmts. NFC. The instruction list is ignored for region statements, there is no reason to set it. llvm-svn: 309196	2017-07-26 22:01:28 +00:00
Michael Kruse	95b39da8ae	[Simplify] Fix invalid removal write for escaping values. A PHI node's incoming block is the user of its operand, not the PHI's parent. Assuming the PHINode's parent being the user lead to the removal of a MemoryAccesses because its use was assumed to be inside of the SCoP. llvm-svn: 309164	2017-07-26 19:58:15 +00:00
Roman Gareev	2e580538be	[ScheduleOptimizer] Translate to C++ bindings Translate the ScheduleOptimizer to use the new isl C++ bindings. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D35845 llvm-svn: 309119	2017-07-26 14:59:15 +00:00
Michael Kruse	1df1aac014	[ScopInfo] Avoid use of getStmtFor(BB). NFC. Since there will be no more a 1:1 correspondence between statements and basic blocks, we would like to get rid of the method getStmtFor(BB) and its uses. Here we remove one of its uses in ScopInfo by fetching the statement in which the call instruction lies. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35691 llvm-svn: 309110	2017-07-26 13:25:28 +00:00
Michael Kruse	11ed062258	[SCEVValidator] Loop exit values of loops before the SCoP are synthesizable. In the following loop: int i; for (i = 0; i < func(); i+=1) ; SCoP: for (int j = 0; j<n; j+=1) S(i, j) The value i is synthesizable in the SCoP that includes only the j-loop. This is because i is fixed within the SCoP, it is irrelevant whether it originates from another loop. This fixes a strange case where a PHI was synthesizable in a SCoP, but not its incoming value, triggering an assertion. This should fix MultiSource/Applications/sgefa/sgefa of the perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable buildbot. llvm-svn: 309109	2017-07-26 13:05:45 +00:00
Tobias Grosser	9ddcf8e6ac	Revert accidental isl changes in 308923 It seems I still had some incomplete changes in the tree when committing. In general, we only import changes from isl upstream. In this case, the changes were especially unfortunate, as they broke the error management in isl_flow.c and consequently caused regressions. Thanks to Michael Kruse for spotting this mistake. llvm-svn: 309039	2017-07-25 22:15:47 +00:00
Michael Kruse	8d89179e33	[ScopInfo] Rename ScopStmt::contains(BB) to represents(BB). NFC. In future, there will no longer be a 1:1 correspondence between statements and basic blocks, so the name `contains` does not correctly capture their relationship. A BB may in fact comprise multiple statements; hence we describe a statement 'representing' a basic block. Differential Revision: https://reviews.llvm.org/D35838 llvm-svn: 308982	2017-07-25 16:25:37 +00:00
Siddharth Bhat	43f178bbc9	[PPCGCodeGeneration] Skip arrays with empty extent. Invariant load hoisted scalars, and arrays whose size we can statically compute to be 0 do not need to be allocated as arrays. Invariant load hoisted scalars are sent to the kernel directly as parameters. Earlier, we used to allocate `0` bytes of memory for these because our computation of size from `PPCGCodeGeneration::getArraySize` would result in `0`. Now, since we don't invariant loads as arrays in PPCGCodeGeneration, this problem does not occur anymore. Differential Revision: https://reviews.llvm.org/D35795 llvm-svn: 308971	2017-07-25 12:35:36 +00:00
Tobias Grosser	d7065e5df5	Move MemoryAccess::isStride* to isl++ llvm-svn: 308927	2017-07-24 20:50:22 +00:00
Tobias Grosser	b739cb42f5	Move MemoryAccess::InvalidDomain to isl++ llvm-svn: 308923	2017-07-24 20:30:34 +00:00
Tobias Grosser	cdf471baef	Move MemoryAccess::getPwAff to isl++ llvm-svn: 308895	2017-07-24 16:36:34 +00:00
Tobias Grosser	1f6ba7e238	Move MemoryAccess::MemoryAccess to isl++ llvm-svn: 308893	2017-07-24 16:22:32 +00:00
Tobias Grosser	206e9e3b3b	Move ScopArrayInfo::getFromAccessFunction and getFromId to isl++ llvm-svn: 308892	2017-07-24 16:22:27 +00:00
Michael Kruse	54071126d8	[ForwardOpTree] Properly indent enumeration in comment. NFC. llvm-svn: 308887	2017-07-24 15:34:03 +00:00
Michael Kruse	67752076bc	[ForwardOpTree] Rename FD_CanForward to FD_CanForwardLeaf. NFC. To make the meaning and distinction to FD_CanForwardTree clearer. llvm-svn: 308886	2017-07-24 15:33:58 +00:00
Michael Kruse	d85e345ce0	[ForwardOpTree] Add comments to ForwardingDecision items. NFC. In particular, explain the difference between FD_CanForward and FD_CanForwardTree. llvm-svn: 308885	2017-07-24 15:33:53 +00:00
Michael Kruse	07e8c36dc7	[ForwardOpTree] Support read-only value uses. Read-only values (values defined before the SCoP) require special handling with -polly-analyze-read-only-scalars=true (which is the default). If active, each use of a value requires a read access. When a copied value uses a read-only value, we must also ensure that such a MemoryAccess is available or is created. Differential Revision: https://reviews.llvm.org/D35764 llvm-svn: 308876	2017-07-24 12:43:27 +00:00
Siddharth Bhat	e2699b572e	[Polly] [NFC] [ScopDetection] Make `polly-only-func` perform regex scop name match. Summary: - We were using `.count` in `StringRef`, which matches substrings. - We may want to use this for equality as well. - Generalise this, so allow regexes as a parameter to `polly-only-func`. Differential Revision: https://reviews.llvm.org/D35728 llvm-svn: 308875	2017-07-24 12:40:52 +00:00
Michael Kruse	5b8a9095e8	[ForwardOpTree] Fix mixup in comment. NFC. The cases DoIt==false and DoIt==true were mixed up. Thanks to Siddharth for noticing. llvm-svn: 308874	2017-07-24 12:39:46 +00:00
Michael Kruse	25a688165b	[ScopInfo] Fix typo in method name. NFC. prependInstrunction -> prependInstruction Thanks Nandini for noticing. llvm-svn: 308873	2017-07-24 12:39:41 +00:00
Siddharth Bhat	f7face4bc4	Convert GPUNodeBuilder::getArraySize to islcpp. Note: PPCGCodeGeneration::pollyBuildAstExprForStmt is at https://reviews.llvm.org/D35770 Differential Revision: https://reviews.llvm.org/D35771 llvm-svn: 308870	2017-07-24 09:08:21 +00:00
Siddharth Bhat	35de900917	[NFC] Move PPCGCodeGeneration::pollyBuildAstExprForStmt to isl++. Differential Revision: https://reviews.llvm.org/D35771 llvm-svn: 308869	2017-07-24 08:34:24 +00:00
Tobias Grosser	325812ac6d	Simplify: Adopt for translation of MemoryAccess::getAccessRelation For some reason this one was missed earlier. llvm-svn: 308845	2017-07-23 08:15:28 +00:00
Tobias Grosser	1959dbda75	Move MemoryAccess::get*ArrayId to isl++ llvm-svn: 308843	2017-07-23 04:08:59 +00:00
Tobias Grosser	3b196131b5	Move applyScheduleToAccessRelation to isl++ llvm-svn: 308842	2017-07-23 04:08:52 +00:00
Tobias Grosser	6a87036e0f	Move MemoryAccess::getAddressFunction to isl++ llvm-svn: 308841	2017-07-23 04:08:45 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Tobias Grosser	22da5f087a	Move MemoryAccess::getOriginalAccessRelation to isl++ llvm-svn: 308839	2017-07-23 04:08:27 +00:00
Tobias Grosser	0c4c2eef75	Move MemoryAccess::AccessRelation to isl++ llvm-svn: 308838	2017-07-23 04:08:22 +00:00
Tobias Grosser	b6e7a85a6d	Move MemoryAccess::createBasicAccessMap to isl++ llvm-svn: 308837	2017-07-23 04:08:17 +00:00
Tobias Grosser	fe46c3ff3a	Move MemoryAccess::id to isl++ llvm-svn: 308836	2017-07-23 04:08:11 +00:00
Michael Kruse	ab8f0d57df	[Simplify] Remove partial write accesses with empty domain. If the access relation's domain is empty, the access will never be executed. We can just remove it. We only remove write accesses. Partial read accesses are not yet supported and instructions in the statement might require the llvm::Value holding the read's result to be defined. llvm-svn: 308830	2017-07-22 20:33:09 +00:00
Michael Kruse	e52ebd1ae4	[ScopInfo] Adapt indentation of instruction list printing. Change the indention of the last brace to align with the opening line. Before: Instructions { %val = fadd double %arg, 2.100000e+01 store double %val, double* %A } After: Instructions { %val = fadd double %arg, 2.100000e+01 store double %val, double* %A } llvm-svn: 308828	2017-07-22 16:44:39 +00:00
Michael Kruse	e5f4706a55	[ForwardOpTree] Support hoisted invariant loads. Hoisted loads can be trivially supported because there are no MemoryAccess to be modified, the loaded value is just available at code generation. llvm-svn: 308826	2017-07-22 14:30:02 +00:00
Michael Kruse	a6b2de3b59	[ForwardOpTree] Introduce the -polly-optree pass. This pass 'forwards' operand trees into statements that use them in order to avoid scalar dependencies. This minimal implementation handles only the case of speculatable instructions. We will successively add support for: - Hoisted loads - Read-only values - Synthesizable values - Loads - PHIs - Forwarding only parts of the tree Differential Revision: https://reviews.llvm.org/D35754 llvm-svn: 308825	2017-07-22 14:02:47 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Philipp Schaad	2f3073b5cb	[Polly][GPGPU] Added SPIR Code Generation and Corresponding Runtime Support for Intel Summary: Added SPIR Code Generation to the PPCG Code Generator. This can be invoked using the polly-gpu-arch flag value 'spir32' or 'spir64' for 32 and 64 bit code respectively. In addition to that, runtime support has been added to execute said SPIR code on Intel GPU's, where the system is equipped with Intel's open source driver Beignet (development version). This requires the cmake flag 'USE_INTEL_OCL' to be turned on, and the polly-gpu-runtime flag value to be 'libopencl'. The transformation of LLVM IR to SPIR is currently quite a hack, consisting in part of regex string transformations. Has been tested (working) with Polybench 3.2 on an Intel i7-5500U (integrated graphics chip). Reviewers: bollu, grosser, Meinersbur, singam-sanjay Reviewed By: grosser, singam-sanjay Subscribers: pollydev, nemanjai, mgorny, Anastasia, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35185 llvm-svn: 308751	2017-07-21 16:11:06 +00:00
Michael Kruse	e186013149	Annotate dump() functions with LLVM_DUMP_METHOD. NFC. llvm-svn: 308749	2017-07-21 15:54:13 +00:00
Michael Kruse	5d5184698d	[ScopInfo] Don't compile dump() functions into non-assert builds. NFC. This follows a convention used in LLVM. llvm-svn: 308748	2017-07-21 15:54:07 +00:00
Michael Kruse	cd4c977b8b	[ScopInfo] Print instructions in dump(). Print a statement's instruction on dump() regardless of -polly-print-instructions. dump() is supposed to be used in the debugger only and never in regression tests. While debugging, get all the information we have and we are not bound to break anything. For non-dump purposes of print, forward the setting of -polly-print-instructions as parameters. Some calls to print() had to be changed because the PollyPrintInstructions setting is only available in ScopInfo.cpp. In ScheduleOptimizer.cpp, dump() was used in regression tests. That's not what dump() is for. The print parameter "PrintInstructions" will also be useful for an explicit print SCoP pass in a future patch. llvm-svn: 308746	2017-07-21 15:35:53 +00:00
Siddharth Bhat	06d4ed6787	[NFC] [RegisterPasses] Fix typo: To early -> too early. llvm-svn: 308743	2017-07-21 15:12:03 +00:00
Siddharth Bhat	a0fb8b23e1	[NFC] [PPCGCodeGeneration] Print `verifyModule` failure to debug stream. If verifyModule fails, it is helpful to know why it failed. Add a log to the debug stream that prints the failure. llvm-svn: 308727	2017-07-21 11:21:44 +00:00
Tobias Grosser	018103d34e	Fix typo in function name Bllock -> Block llvm-svn: 308715	2017-07-21 06:00:38 +00:00
Tobias Grosser	1eeedf4829	[IslNodeBuilder] Relax complexity check in invariant loads and run it early When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659	2017-07-20 19:55:19 +00:00
Tobias Grosser	54491db687	Support fabs and copysign in Polly-ACC llvm-svn: 308649	2017-07-20 18:26:34 +00:00
Michael Kruse	b936c4b332	[PPCG] Compile fix for MSVC. Visual Studio, even the 2017 version, does not support C99 VLAs. For VLA parameters, the length of the outermost dimension is not required anyway, so remove it. llvm-svn: 308643	2017-07-20 18:04:54 +00:00
Michael Kruse	1ce6791e7e	[ScopInfo] Get a list of statements for a region node. NFC. When constructing a schedule tree and there are multiple statements for a basic block, create a sequence node for these statements. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35679 llvm-svn: 308635	2017-07-20 17:18:58 +00:00
Michael Kruse	6eba4b1031	[ScopInfo] Remove dependency of Scop::getLastStmtFor(BB) on getStmtFor(BB). NFC. We are working towards removing uses of Scop::getStmtFor(BB). In this patch, we remove dependency of Scop::getLastStmtFor(BB) on getStmtFor(BB). To do so, we get the list of all statements corresponding to the BB and then fetch the last one. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35665 llvm-svn: 308633	2017-07-20 17:08:50 +00:00
Michael Kruse	3562f272cf	[ScopInfo] Use map for lookupPHIReadOf. NFC. Introduce previously missing PHIReads analogous to the already existing PHIWrites/ValueWrites/ValueReads maps. PHIReads was initially not required and the later introduced lookupPHIReadOf() used a linear search instead. With PHIReads, lookupPHIReadOf() can now also do a map lookup and remove any surprising performance/behaviour differences to lookupPHIWriteOf(), lookupValueWriteOf() and lookupValueReadOf(). llvm-svn: 308630	2017-07-20 16:47:57 +00:00
Michael Kruse	22058c3fbb	[Simplify] Remove unused instructions and accesses. Use a mark-and-sweep algorithm to find and remove unused instructions and MemoryAccesses. This is useful in particular to remove scalar writes that are never used anywhere. A scalar write in a loop induces a write-after-write dependency that prevents the loop iterations from being rescheduled. Such writes can be a result of previous transformations such as DeLICM and operand tree forwarding. It adds a new class VirtualInstruction that represents an instruction in a particular statement. At the moment an instruction can only belong to the statement that represents a BasicBlock. In the future, instructions can be in one of multiple statements representing a BasicBlock (Nandini's work), in different statements than its BasicBlock would indicate, and even multiple statements at once (by forwarding operand trees). It also integrates nicely with the VirtualUse class. ScopStmt::contains(Instruction*) currently uses the instruction's parent BasicBlock to check whether it contains the instruction. It will need to check the actual statement list when one of the aforementioned features becomes possible. Differential Revision: https://reviews.llvm.org/D35656 llvm-svn: 308626	2017-07-20 16:21:55 +00:00
Siddharth Bhat	9e3db2b756	[PPCGCodeGen] [3/3] Update PPCGCodeGen + tests to latest ppcg. This commit WILL COMPILE. 1. `PPCG` now uses `isl_multi_pw_aff` instead of an array of `pw_aff`. This needs us to adjust how we index array bounds and how we construct array bounds. 2. `PPCG` introduces two new kinds of nodes: `init_device` and `clear_device`. We should investigate what the correct way to handle these are. 3. `PPCG` has gotten smarter with its use of live range reordering, so some of the tests have a qualitative improvement. 4. `PPCG` changed its output style, so many test cases need to be updated to fit the new style for `polly-acc-dump-code` checks. Differential Revision: https://reviews.llvm.org/D35677 llvm-svn: 308625	2017-07-20 15:48:36 +00:00
Siddharth Bhat	3d4d752188	[PPCG] [2/3] Make polly specific PPCG Changes. - This commit WILL NOT COMPILE. `PPCGCodeGeneration` requires changes since some of PPCG's internal data structures have been modified. - Has polly-specific changes to PPCG. Polly exports certain functionality that is private to PPCG. It also creates stubs for large parts of the pet API as well as other functions in `ppcg/external.c` to keep the linker happy. - This commit includes changes to CMakeLists.txt. Differential Revision: https://reviews.llvm.org/D35676 llvm-svn: 308624	2017-07-20 15:48:22 +00:00
Siddharth Bhat	951515f236	[PPCG] [1/3] Bump up PPCG version to 0.07. - This commit WILL NOT COMPILE, as it checks in vanilla PPCG 0.07 - We choose to introduce this commit into the history to cleanly display the Polly-specific changes made to PPCG. Differential Revision: https://reviews.llvm.org/D35675 llvm-svn: 308623	2017-07-20 15:48:13 +00:00
Michael Kruse	4642c3ce85	[ScopBuilder] Avoid use of getStmtFor(BB). NFC. Since there will no longer be a 1:1 correspondence between statements and basic blocks, we would like to get rid of the method getStmtFor(BB) and its uses. Here we remove one of its uses in ScopBuilder by fetching the statement in which the instruction lies. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35610 llvm-svn: 308610	2017-07-20 12:47:09 +00:00
Michael Kruse	0865585eab	[ScopInfo] Add support for wrap-around of integers in unsigned comparisons. This is one possible solution to implement wrap-arounds for integers in unsigned icmp operations. For example, store i32 -1, i32* %A_addr %0 = load i32, i32* %A_addr %1 = icmp ult i32 %0, 0 %1 should hold false, because under the assumption of unsigned integers, -1 should wrap around to 2^32-1. However, previously, it was assumed that the MSB (Most Significant Bit - aka the Sign bit) was never set for integers in unsigned operations. This patch modifies the buildConditionSets function in ScopInfo.cpp to give better information about the integers in these unsigned comparisons. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35464 llvm-svn: 308608	2017-07-20 12:37:02 +00:00
Michael Kruse	89da6bbcb4	Make byref llvm::Use parameters const. NFC. llvm-svn: 308522	2017-07-19 20:41:56 +00:00
Michael Kruse	8b8058072f	[ScopInfo] Integrate ScalarDefUseChain into polly::Scop. NFC. Before this patch, ScalarDefUseChain was a tool used by DeLICM to find all reads and writes of scalar accesses. It iterated once over all accesses and stores the accesses into maps. By integrating it into the Scop class, we can keep the maps up-to-date without the need for recomputing them. It will be needed for more than DeLICM in the future, such as SCoP simplification, code movement between virtual statements, and array expansion (GSoC project). Compared to ScalarUseDefChain, we save two maps by finding the ScopStmt a Def/PHIRead must reside in, and use its already existing lookup function to find the MemoryAccess. Differential Revision: https://reviews.llvm.org/D35631 llvm-svn: 308495	2017-07-19 17:11:25 +00:00
Roman Gareev	750374181b	Make the pattern matching work with modified memory accesses Some optimizations (e.g., DeLICM) can modify memory accesses (e.g., change their MemoryKind). Consequently, the pattern matching should take it into the account. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D33138 llvm-svn: 308494	2017-07-19 16:59:06 +00:00
Tobias Grosser	199ec4af40	[ScopInfo] Do not create entries in map if none exists Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 308491	2017-07-19 16:31:10 +00:00
Michael Kruse	629f9185bf	[Simplify] Ensure all counters are reset before next SCoP is processed. NFC. llvm-svn: 308473	2017-07-19 14:07:21 +00:00
Tobias Grosser	303bd07c6e	[ScopInfo] Introduce tryGetValueStored Summary: This makes code more readable and allows to reuse this functionality in the future at other places. Suggested-by Michael Kruse in post-commit review of r307660. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35585 llvm-svn: 308435	2017-07-19 11:09:16 +00:00
Michael Kruse	4dfa732750	[ScopInfo] Introduce list of statements in Scop::StmtMap. NFC. Once statements are split, a BasicBlock will comprise of multiple statements. To prepare for this change in future, we introduce a list of statements in the statement map. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35301 llvm-svn: 308318	2017-07-18 15:41:49 +00:00
Siddharth Bhat	edfef5ae8e	[NFC] [PPCGCodeGeneration] cleanup kills related code. We extended kills in Polly to handle both `phi` nodes and scalars that are not used within the Scop. Update the comments and choice of variable names to reflect this. llvm-svn: 308279	2017-07-18 09:15:16 +00:00
Eli Friedman	e737fc120e	[Polly] [OptDiag] Updating Polly Diagnostics Remarks Utilizing newer LLVM diagnostic remark API in order to enable use of opt-viewer tool. Polly Diagnostic Remarks also now appear in YAML remark file. In this patch, I've added the OptimizationRemarkEmitter into certain classes where remarks are being emitted and update the remark emit calls itself. I also provide each remark a BasicBlock or Instruction from where it is being called, in order to compute the hotness of the remark. Patch by Tarun Rajendran! Differential Revision: https://reviews.llvm.org/D35399 llvm-svn: 308233	2017-07-17 23:58:33 +00:00
Tobias Grosser	66e38a84be	[Polly] Avoid use of `getStmtFor(BB)` in PolyhedralInfo. NFC Summary: Since there will no longer be a 1-1 correspondence between statements and basic blocks, we would like to get rid of the method `getStmtFor(BB)` and its uses. Here we remove one of its uses in PolyhedralInfo, as suggested by Michael Sir. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35300 llvm-svn: 308220	2017-07-17 20:58:13 +00:00
Tobias Grosser	4556c9b8fe	[ScopInfo] Simplify new access functions under domain context Summary: We do not keep domain constraints on access functions when building the scop. Hence, for consistency reasons, it makes also sense to not include them when storing a new access function. This change results in simpler access functions that make output easier to read. This patch also helps to make DeLICMed memory accesses to be understood by our matrix multiplication pattern matching pass. Further changes to the matrix multiplication pattern matching are needed for this to work, so the corresponding test case will be added in a future commit. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35237 llvm-svn: 308215	2017-07-17 20:47:10 +00:00
Siddharth Bhat	233d717ec1	[PPCGCodeGeneration] Generate invariant loads before trying to generate IR. - We should call `preloadInvariantLoads` to make sure that code is generated for invariant loads in the kernel. Differential Revision: https://reviews.llvm.org/D35410 llvm-svn: 308187	2017-07-17 15:57:01 +00:00
Tobias Grosser	21cbcf03d3	ScopInfo: Remove not-in-DomainMap statements in separate function This separates ScopBuilder internal and ScopBuilder external functionality. llvm-svn: 308152	2017-07-16 23:55:38 +00:00
Tobias Grosser	3012a0b302	Fix typo in comment [NFC] llvm-svn: 308149	2017-07-16 22:44:17 +00:00
Tobias Grosser	8e1280b8b2	[Polly] Fix a typo [NFC] Reviewers: grosser, Meinersbur, bollu Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35459 llvm-svn: 308134	2017-07-16 13:54:41 +00:00
Tobias Grosser	a3aa423fc3	[ScopDetection] If a loop is not part of a scop, none of its backedges can be This patch makes sure that in case a loop is not fully contained within a region that later forms a SCoP, none of the loop backedges are allowed to be part of the region. We currently do not support the situation where only some of a loop's backedges are part of a scop. Today, this can break both scop modeling and code generation. One such breaking test case is for example test/ScopDetectionDiagnostics/loop_partially_in_scop-2.ll, where we totally forgot to code generate some of the backedges. Fortunately, it is commonly not necessary to support these partial loops, it is way more common that either no backedge is included in a region or all loop backedges are included. This fixes a recent miscompile in MultiSource/Benchmarks/MiBench/consumer-typeset which was exposed after r306477. llvm-svn: 308113	2017-07-15 22:42:17 +00:00
Tobias Grosser	325204a30e	[Polly] Translate Scop::DomainMap to islpp Reviewers: grosser, Meinersbur, bollu Subscribers: pollydev Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35453 llvm-svn: 308093	2017-07-15 12:41:32 +00:00
Tobias Grosser	13acbb91ee	[Polly] Use Isl c++ for InvalidDomainMap Reviewers: grosser, Meinersbur, bollu Subscribers: maxf, pollydev Tags: #polly Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35308 llvm-svn: 308089	2017-07-15 09:01:31 +00:00
Tobias Grosser	231179ac70	update isl to: isl-0.18-791-ga22eb92 This is a regular maintenance update llvm-svn: 308013	2017-07-14 10:36:00 +00:00
Siddharth Bhat	03346c2701	[PPCGCodeGeneration] Fix runtime check adjustments since they make assumptions about BB layout. - There is a conditional branch that is used to switch between the old and new versions of the code. - If we detect that the build was unsuccessful, `PPCGCodeGeneration` will change the runtime check to be always set to false. - To actually reach this runtime check instruction, `PPCGCodeGeneration` was using assumptions about the layout of the BBs. - However, invariant load hoisting violates this assumption by inserting an extra basic block in the middle. - Fix the assumption on the layout by having `createScopConditionally` return the conditional branch instruction. - Use this reference to set to always-false. llvm-svn: 308010	2017-07-14 10:00:25 +00:00
Siddharth Bhat	a1b2086a33	[Invariant Loads] Do not consider invariant loads to have dependences. We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907	2017-07-13 12:18:56 +00:00
Singapuram Sanjay Srivallabh	1abd9ffa37	[PPCGCodeGen] Differentiate kernels based on their parent Scop Summary: Add a sequence number that identifies a ptx_kernel's parent Scop within a function to its name to differentiate it from other kernels produced from the same function, yet different Scops. Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well. Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ \| kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: mehdi_amini, nemanjai, pollydev, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35176 llvm-svn: 307814	2017-07-12 16:46:19 +00:00
Tobias Grosser	bed2ca6eac	[Simplify] Also remove redundant writes which originally came from PHI nodes llvm-svn: 307660	2017-07-11 14:29:39 +00:00
Philip Pfaffe	d99c406e3d	[Polly][CMake] Use the CMake Package instead of llvm-config in out-of-tree builds Summary: As of now, Polly uses llvm-config to set up LLVM dependencies in an out-of-tree build. This is problematic for two reasons: 1) Right now, in-tree and out-of-tree builds in fact do different things. E.g., in an in-tree build, libPolly depends on a handful of LLVM libraries, while in an out-of-tree build it depends on all of them. This means that we often need to treat both paths separately. 2) I'm specifically unhappy with the way libPolly is linked right now, because it just blindly links against all the LLVM libs. That doesn't make a lot of sense. For instance, one of these libs is LLVMTableGen, which contains a command line definition of a -o option. This means that I can not link an out-of-tree libPolly into a tool which might want to offer a -o option as well. This patch (mostly) drops the use of llvm-config in favor of LLVM's exported cmake package. However, building Polly with unittests requires access to the gtest sources (in the LLVM source tree). If we're building against an LLVM installation, this source tree is unavailable and must be specified. I'm using llvm-config to provide a default in this case. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: tstellar, bollu, chapuni, mgorny, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D33299 llvm-svn: 307650	2017-07-11 11:24:25 +00:00
Tobias Grosser	6a4c12fb33	Always export the latest memory access relations This allows us to export the results from transformations such as DeLICM. llvm-svn: 307641	2017-07-11 10:10:13 +00:00
Tobias Grosser	153a508349	[IslAst] Print memory accesses in AST dump When providing the option "-polly-ast-print-accesses" Polly also prints the memory accesses that are generated: #pragma known-parallel for (int c0 = 0; c0 <= 1023; c0 += 4) #pragma simd for (int c1 = c0; c1 <= c0 + 3; c1 += 1) Stmt_for_body( /* read */ &MemRef_B[0] /* write */ MemRef_A[c1] ); This makes writing and debugging memory layout transformations easier. Based on a patch contributed by Thomas Lang (ETH Zurich) llvm-svn: 307579	2017-07-10 20:13:06 +00:00
Tobias Grosser	f44f005a7d	Remove freed InvalidDomains from InvalidDomainMap. Summary: Since r306667, propagateInvalidStmtDomains gets a reference to an InvalidDomainMap. As part of the branch leading to return false, the respective domain is freed. It is, however, not removed from the InvalidDomainMap, leaking a pointer to a freed object which results in a use-after-free. Fix this by removing the domain from the map before returning. We tried to derive a test case that reliably fails, but did not succeed in producing one. Hence, for now the failures in our LNT bots must be sufficient to keep this issue tested. Reviewers: grosser, Meinersbur, bollu Subscribers: bollu, nandini12396, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D34971 llvm-svn: 307499	2017-07-09 15:47:17 +00:00
Siddharth Bhat	761e5b9310	[Polly] [PPCGCodeGeneration] Teach `must_kills` to kill scalars that are local to the scop. - By definition, we can pass something as a `kill` to PPCG if we know that no data can flow across a kill. - This is useful for more complex examples where we have scalars that are local to a scop. - If the local is only used within a scop, we are free to kill it. Differential Revision: https://reviews.llvm.org/D35045 llvm-svn: 307260	2017-07-06 13:42:42 +00:00
Singapuram Sanjay Srivallabh	79f13b9a80	Prefix the name of the calling host function in the name of callee GPU kernel Summary: Provide more context to the name of a GPU kernel by prefixing its name with the host function that calls it. E.g. The first kernel called by `gemm` would be `FUNC_gemm_KERNEL_0`. Kernels currently follow the "kernel_#" (# = 0,1,2,3,...) nomenclature. This patch makes it easier to map host caller and device callee, especially when there are many kernels produced by Polly-ACC. Reviewers: grosser, Meinersbur, bollu, philip.pfaffe, kbarton! Reviewed By: grosser Subscribers: nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33985 llvm-svn: 307173	2017-07-05 16:48:21 +00:00
Siddharth Bhat	47c7237bd8	[NFC] [ScopInfo] fix warning about construction order llvm-svn: 307164	2017-07-05 15:07:28 +00:00
Siddharth Bhat	a82f2d264a	[PPCGCodeGeneration] Teach Polly to start using live range reordering. Polly did not use PPCG's live range reordering feature. Teach PPCGCodeGeneration to use this. Documentation on this is sparse, so much of the code is conservative. We currently kill all phi nodes in a Scop by appending them to the must_kill map we pass to PPCG. I do not have a proof of correctness, but it seems to be intuitively correct. We also do not handle `array_order`, which, quoting PPCG, is: PPCG/gpu.h: "Order dependences on non-scalars." It seems to consist of RAW dependences between arrays. We need to pass this information for more complex privatization cases. Differential Revision: https://reviews.llvm.org/D34941 llvm-svn: 307163	2017-07-05 14:57:04 +00:00
Tobias Grosser	5e41458985	Bump isl to isl-0.18-768-g033b61ae Summary: This is a general maintenance update Reviewers: grosser Subscribers: srhines, fedor.sergeev, pollydev, llvm-commits Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Differential Revision: https://reviews.llvm.org/D34903 llvm-svn: 307090	2017-07-04 15:54:11 +00:00
Singapuram Sanjay Srivallabh	02ca346e48	Introduce a hybrid target to generate code for either the GPU or CPU Summary: Introduce a "hybrid" `-polly-target` option to optimise code for either the GPU or CPU. When this target is selected, PPCGCodeGeneration will attempt first to optimise a Scop. If the Scop isn't modified, it is then sent to the passes that form the CPU pipeline, i.e. IslScheduleOptimizerPass, IslAstInfoWrapperPass and CodeGeneration. In case the Scop is modified, it is marked to be skipped by the subsequent CPU optimisation passes. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: kbarton, nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D34054 llvm-svn: 306863	2017-06-30 19:42:21 +00:00
Tobias Grosser	37c8ee7611	Fix typo llvm-svn: 306791	2017-06-30 06:30:51 +00:00
Michael Kruse	476f855ec8	[ScopInfo] Do not use ScopStmt in Domain derivation of ScopInfo. NFC ScopStmts were being used in the computation of the Domain of the SCoPs in ScopInfo. Once statements are split, there will not be a 1-to-1 correspondence between Stmts and Basic blocks. Thus this patch avoids the use of getStmtFor() by creating a map of BB to InvalidDomain and using it to compute the domain of the statements. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D33942 llvm-svn: 306667	2017-06-29 12:47:41 +00:00
NAKAMURA Takumi	6936506f50	Test commit llvm-svn: 306657	2017-06-29 09:46:01 +00:00
Singapuram Sanjay Srivallabh	42caad0257	Initializing NVPTX backend within Polly Summary: The NVPTX backend is now initialised within Polly. A language front-end need not be modified to initialise the backend, just for Polly. Reviewers: Meinersbur, grosser Reviewed By: Meinersbur Subscribers: vchuravy, mgorny Tags: #polly Differential Revision: https://reviews.llvm.org/D31859 llvm-svn: 306649	2017-06-29 07:43:22 +00:00
Michael Kruse	b738ffa845	Heap allocation for new arrays. This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add: - In the class ScopArrayInfo, we add a boolean as member called IsOnHeap which represents the fact that the array is allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes these two blocks and adds the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified: - createScopArrayInfo and getOrCreateScopArrayInfo such that they return a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionally such that it returns both start block and end block of the scop, because we need these two blocks to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540	2017-06-28 13:02:43 +00:00
Andreas Simbuerger	6d08ec7233	[JSONImport] Check, if the size of an imported array is positive llvm-svn: 306479	2017-06-27 22:30:44 +00:00
Andreas Simbuerger	dbb0ef8e94	[NFC][CodeGen] Use the ExitBlock explicitly. Before we would 'guess' the correct location for the MergeBlock that got introduced when executing a Scop conditionally. This implicitly depends on the situation that at this point during CodeGen there will be nothing between polly.start and polly.exiting. With this commit we explicitly state that we want the block that directly follows polly.exiting. llvm-svn: 306398	2017-06-27 11:33:22 +00:00
Siddharth Bhat	65d7f72f2c	[PPCGCodeGeneration] Add flag to allow polly to fail if GPU kernel fails. - This is useful for debugging GPU code. llvm-svn: 306290	2017-06-26 14:56:56 +00:00
Siddharth Bhat	f291c8d510	[PPCGCodeGeneration] Allow intrinsics within kernels. - In D33414, if any function call was found within a kernel, we would bail out. - This is an over-approximation. This patch changes this by allowing the `llvm.sqrt.*` family of intrinsics. - This introduces an additional step when creating a separate llvm::Module for a kernel (GPUModule). We now copy function declarations from the original module to the new module. - We also populate IslNodeBuilder::ValueMap so that function references to the old module are replaced with the ones in the new module (GPUModule). Differential Revision: https://reviews.llvm.org/D34145 llvm-svn: 306284	2017-06-26 13:12:06 +00:00
Andreas Simbuerger	256070d85c	[NFC] Return both polly.start and polly.exiting from executeScopConditionally. This commit returns both the start and the exit block that are created by executeScopConditionally. In a future commit we will make use of the exit block. Before we would have to use the implicit property that there won't be any code generated between polly.start and polly.exiting at the time of use to find the correct block ('polly.exiting'). All usage locations are semantically unchanged. llvm-svn: 306283	2017-06-26 12:17:11 +00:00
Siddharth Bhat	a12f807f33	[PPCGCodeGeneration] Enable GPU code generation with invariant loads. The condition that disallowed code generation in PPCGCodeGeneration with invariant loads is not required. I haven't been able to construct a counterexample where this generates invalid code. Differential Revision: https://reviews.llvm.org/D34604 llvm-svn: 306245	2017-06-25 14:48:24 +00:00
Tobias Grosser	1b9d1bcc6d	[ScopInfo] Bound the number of array disjuncts in run-time bounds checks This reduces the compilation time of one reduced test case from Android from 16 seconds to 100 mseconds (we bail out), without negatively impacting any other test case we currently have. We still saw occasionally compilation timeouts on the AOSP buildbot. Hopefully, those will go away with this change. llvm-svn: 306235	2017-06-25 06:32:00 +00:00
Michael Kruse	7604d9add5	[ScopBuilder] Pass ScopStmts around instead of BasicBlocks. NFC. During the construction of MemoryAccesses in ScopBuilder, BasicBlocks were used in function parameters, assuming that the ScopStmt can be directly derived from it. This won't be true anymore once we split BasicBlocks into multiple ScopStmt. As a preparation for such a change in the future, we instead pass the ScopStmt and avoid the use of getStmtFor(). There are two occasions where a kind of mapping from BasicBlock to ScopStmt is still required. 1. Get the statement representing the incoming block of a `PHINode` using `getLastStmtOf`. 2. One statement is required to write a scalar to be readable by those which need it. This is most often the statement which contains its definition, which we get using `getStmtFor(Instruction*)`. Differential Revision: https://reviews.llvm.org/D34369 llvm-svn: 306132	2017-06-23 17:55:36 +00:00
Tobias Grosser	78a7a6cddf	Bail out early in case we see an invalid runtime context in buildAliasGroups llvm-svn: 306088	2017-06-23 08:05:31 +00:00
Tobias Grosser	57a1d36d98	Hoist buildMinMaxAccess computeout to cover full alias-group This allows us to bail out not only in case the lexmin/max computation is too expensive, but also in case the cumulative cost across an alias group is too expensive. This is an improvement of r303404, which did not seem to be sufficient to keep the Android Buildbot quiet. llvm-svn: 306087	2017-06-23 08:05:27 +00:00
Tobias Grosser	8f23fb8486	[islpp] Move buildMinMaxAccess[es] to C++ [NFC] llvm-svn: 306086	2017-06-23 08:05:20 +00:00
Eli Friedman	5e589ea4b1	[ScopInfo] Fix crash with sum of invariant load and AddRec. r303971 added an assertion that SCEV addition involving an AddRec and a SCEVUnknown must involve a dominance relation: either the SCEVUnknown value dominates the AddRec's loop, or the AddRec's loop header dominates the SCEVUnknown. This is generally fine for most usage of SCEV because it isn't possible to write an expression in IR which would violate it, but it's a bit inconvenient here for polly. To solve the issue, just avoid creating a SCEV expression which triggers the assertion. I'm not really happy with this solution, but I don't have any better ideas. Fixes https://bugs.llvm.org/show_bug.cgi?id=33464. Differential Revision: https://reviews.llvm.org/D34259 llvm-svn: 305864	2017-06-20 22:53:02 +00:00
Reid Kleckner	df2b283bf9	Fix -Wsign-compare in ScopInfo.cpp llvm::Loop::getNumBlocks returns an unsigned int, not a long. llvm-svn: 305717	2017-06-19 17:44:02 +00:00
Tobias Grosser	dcd94e3e93	[ScheduleOptimizer] Fix minor typo [NFC] llvm-svn: 305709	2017-06-19 16:55:48 +00:00
Tobias Grosser	2fb3ed200a	[ScheduleOptimizer] Move isolateFullPartialTiles and isolateAndUnrollMatMulInnerLoops to C++ llvm-svn: 305676	2017-06-19 10:40:12 +00:00
Michael Kruse	214deb7960	[CodeGen] Emit aliasing metadata for new arrays. Ensure that all array base pointers are assigned before generating aliasing metadata by allocating new arrays beforehand. Before this patch, getBasePtr() returned nullptr for new arrays because the arrays were created at a later point. Nullptr did not match to any array after the created array base pointers have been assigned and when the loads/stores are generated. llvm-svn: 305675	2017-06-19 10:19:29 +00:00
Eli Friedman	127e0cd21b	Don't check side effects for functions outside of SCoP In r304074 we introduce a patch to accept results from side effect free functions into SCEV modeling. This causes rejection of cases where the call is happening outside the SCoP. This patch checks if the call is outside the Region and treats the results as a parameter (SCEVType::PARAM) to the SCoP instead of returning SCEVType::INVALID. Patch by Sameer Abu Asal. llvm-svn: 305423	2017-06-14 22:43:28 +00:00
Siddharth Bhat	bccaea57c0	[Polly] [PPCGCodeGeneration] Skip Scops which contain function pointers. In `PPCGCodeGeneration`, we try to take the references of every `Value` that is used within a Scop to offload to the kernel. This occurs in `GPUNodeBuilder::createLaunchParameters`. This breaks if one of the values is a function pointer, since one of these cases will trigger: 1. We try to take the references of an intrinsic function, and this breaks at `verifyModule`, since it is illegal to take the reference of an intrinsic. 2. We manage to take the reference to a function, but this fails at `verifyModule` since the function will not be present in the module that is created in the kernel. 3. Even if `verifyModule` succeeds (which should not occur), we would then try to call a host function from the device, which is illegal runtime behaviour. So, we disable this entire range of possibilities by simply not allowing function references within a `Scop` which corresponds to a kernel. However, note that this is too conservative. We can allow intrinsics within kernels if the backend can lower the intrinsic correctly. For example, an intrinsic like `llvm.powi.*` can actually be lowered by the `NVPTX` backend. We will now gradually whitelist intrinsics which are known to be safe. Differential Revision: https://reviews.llvm.org/D33414 llvm-svn: 305185	2017-06-12 11:41:09 +00:00
Siddharth Bhat	8139e2eb75	[NFC] Fix typo in `ImportJScop` declaration. Contributed by: Singapuram Sanjay Differential Revision: https://reviews.llvm.org/D34079 llvm-svn: 305183	2017-06-12 09:43:12 +00:00
Tobias Grosser	0b103d92c1	[isl-cpp] Remove isl/mat.h and add insert_partial_schedule The isl/mat.h functionality was incomplete (we returned 'void ' instead of 'isl::mat') and is likely not needed. .insert_partial_schedule was until now not exported in the bindings, but will be needed in the next step. llvm-svn: 305161	2017-06-11 04:39:21 +00:00
Siddharth Bhat	286c916dde	[Polly] [ScopDetection] Allow passing multiple functions to `-polly-only-func`. - This is useful to run optimisations on only certain functions. Differential Revision: https://reviews.llvm.org/D33990 llvm-svn: 305060	2017-06-09 08:23:40 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Tobias Grosser	4071cb571a	[ScopInfo] Translate getNonHoistableCtx to C++ [NFC] llvm-svn: 304841	2017-06-06 23:13:02 +00:00
Michael Kruse	281f414c9d	[JScop] Emit error messages on error. In importArrays instead of silently ignoring the file. llvm-svn: 304817	2017-06-06 19:17:32 +00:00
Michael Kruse	ad7a1805be	[Simplify] Use execution order of memory accesses. Iterate through memory accesses in execution order (first all implicit reads, then explicit accesses, then implicit writes). In the test case this caused an implicit load to be handled as if it was loaded after the write. That is, the value being written before it is available. This fixes llvm.org/PR33323 llvm-svn: 304810	2017-06-06 17:46:42 +00:00
Tobias Grosser	deefbced96	[Polly] [BlockGen] Support partial writes in regions Summary: The RegionGenerator traditionally kept a BlockMap that mapped from original basic blocks to newly generated basic blocks. With the introduction of partial writes such a 1:1 mapping is not possible any more, as a single basic block can be code generated into multiple basic blocks. Hence, depending on the use case we need to either use the first basic block or the last basic block. This is intended to address the last four cases of incorrect code generation in our AOSP buildbot and hopefully should turn it green. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D33767 llvm-svn: 304808	2017-06-06 17:17:30 +00:00
Michael Kruse	be194d4efd	[CodeGen] Remove extra ';'. NFC. Fix compiler warning: polly/lib/CodeGen/PerfMonitor.cpp:81:2: warning: extra ‘;’ [-Wpedantic] }; ^ llvm-svn: 304802	2017-06-06 15:56:50 +00:00
Tobias Grosser	c4bfef50f3	Update isl to isl-0.18-679-g6e75a0d This is a regular maintenance update llvm-svn: 304686	2017-06-04 19:13:10 +00:00
Siddharth Bhat	726c28f8c4	[CodeGen] Track trip counts per-scop for performance measurement. - Add a counter that is incremented once on exit from a scop. - Test cases got split into two: one to test the cycles, and another one to test trip counts. - Sample output: ```name=sample-output.txt scop function, entry block name, exit block name, total time, trip count warmup, %entry.split, %polly.merge_new_and_old, 5180, 1 f, %entry.split, %polly.merge_new_and_old, 409944, 500 g, %entry.split, %polly.merge_new_and_old, 1226, 1 ``` Differential Revision: https://reviews.llvm.org/D33822 llvm-svn: 304543	2017-06-02 11:36:52 +00:00
Siddharth Bhat	a4dea6bb05	[CodeGen] Print performance counter information in CSV. This ensures that tools can parse performance information which Polly generates easily. - Sample output: ```name=out.csv scop function, entry block name, exit block name, total time warmup, %entry.split, %polly.merge_new_and_old, 1960 f, %entry.split, %polly.merge_new_and_old, 1238 g, %entry.split, %polly.merge_new_and_old, 1218 ``` - Example code to parse output: ```lang=python, name=example-parse.py import asciitable import sys table = asciitable.read('out.csv', delimiter=',') asciitable.write(table, sys.stdout, delimiter=',') ``` llvm-svn: 304533	2017-06-02 09:20:02 +00:00
Siddharth Bhat	fee75f4ba5	[NFC] [CodeGen] Bail out of per-scop performance reporting if not supported. We should bail out if performance monitoring is not supported, since we would have no information to print per-scop, and `FinalStartBB`, `ReturnFromFinal` would be `nullptr`. Assert that these are not `nullptr` if performance monitoring is supported. llvm-svn: 304529	2017-06-02 08:44:19 +00:00
Siddharth Bhat	07bee290de	[CodeGen] Extend Performance Counter to track per-scop information. Previously, we would generate one performance counter for all scops. Now, we generate both the old information, as well as a per-scop performance counter to generate finer grained information. This patch needed a way to generate a unique name for a `Scop`. The start region, end region, and function name combined provides a unique `Scop` name. So, `Scop` has a new public API to provide its start and end region names. Differential Revision: https://reviews.llvm.org/D33723 llvm-svn: 304528	2017-06-02 08:01:22 +00:00
Michael Kruse	3bb4829936	[CodeGen] Iterate over explicit instruction list for block statements. NFC For when statements do not contain all instructions of a BasicBlock anymore, the block generator needs to go through the explicit list of instructions it contains. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D33653 llvm-svn: 304502	2017-06-02 00:13:49 +00:00
Michael Kruse	678aa336fa	[ScopBuilder] Exclude ignored intrinsics from explicit instruction list. Ignored intrinsics are ignored at code generation, therefore do not need to be part of the instruction list. Specifically, llvm.lifetime.* intrinsics are removed before code generation, referencing them would cause a use-after-free error. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D33768 llvm-svn: 304483	2017-06-01 21:46:27 +00:00
Eli Friedman	de1b318dad	Add opt-bisect support to polly. This is useful for debugging miscompiles and extracting testcases for crashes. See http://llvm.org/docs/OptBisect.html . Differential Revision: https://reviews.llvm.org/D33752 llvm-svn: 304480	2017-06-01 21:29:05 +00:00
Tobias Grosser	dff902fca7	[ScopInfo] Do not lookup key twice [NFC] Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 304410	2017-06-01 12:46:51 +00:00
Tobias Grosser	f51decb5fe	[BlockGenerator] Take context into account when identifying partial writes A partial write is a write where the domain of the values written is a subset of the execution domain of the parent statement containing the write. Originally, we directly checked this subset relation whereas it is indeed only important that the subset relation holds for the parameter values that are known to be valid in the execution context of the scop. We update our check to avoid the unnecessary introduction of partial writes in situations where the write appears to be partial without context information, but where context information allows us to understand that a full write can be generated. This change fixes (hides) a recent regression introduced in r303517, which broke our AOSP builds. The part that is correctly fixed in this change is that we no longer unnecessarily generate a partial write. This is good performance wise and, as we currently do not yet explicitly introduce partial writes in the default configuration, this also hides possible bugs in the partial writes implementation. The crashes that we have originally seen were caused by such a bug, where partial writes were incorrectly generated in region statements. An additional patch in a subsequent commit is needed to address this problem. Reported-by: Eli Friedman <efriedma@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33759 llvm-svn: 304398	2017-06-01 09:34:20 +00:00
Tobias Grosser	6b6ac90098	[BlockGenerator] Translate buildContainsCondition to idiomatic isl C++ llvm-svn: 304354	2017-05-31 21:49:51 +00:00
Tobias Grosser	5ecc5166d9	[isl++] Update bindings This change removes the requirement for explicit conversions from isl::boolean to isl::bool, which resolves a compilation error on OSX. Suggested-by: Siddharth Bhat <siddu.druid@gmail.com> llvm-svn: 304288	2017-05-31 08:46:29 +00:00
Michael Kruse	ed0c2f7e90	[ScopInfo] Do not add terminator & synthesizable instructions to the output instructions. Such instructions are generated on-demand by the CodeGenerator and thus do not need representation in a statement. Differential Revision: https://reviews.llvm.org/D33642 llvm-svn: 304151	2017-05-29 12:27:38 +00:00
Siddharth Bhat	8bb436eb26	Revert "[NFC] Fix formatting & typecast issue. Build succeeds." Should not have 'fixed' the formatting issue, I did not have the most recent version of `clang-format`. This reverts commit 761b1268359e14e59142f253d77864a29d55c56c. llvm-svn: 304148	2017-05-29 11:34:29 +00:00
Siddharth Bhat	ede801ca2b	[NFC] Fix formatting & typecast issue. Build succeeds. - Fix formatting in `RegisterPasses.cpp`. - `assert` tried to compare `isl::boolean` against `long`. Explicitly construct `bool` from `isl::boolean`. This allows the implicit cast of `bool` to `long`. llvm-svn: 304146	2017-05-29 11:00:31 +00:00
Tobias Grosser	d9fb2842e7	Adapt to recent clang-format changes llvm-svn: 304136	2017-05-29 08:06:29 +00:00
Tobias Grosser	1e55db30d5	Delinearize memory accesses that reference parameters coming from function calls Certain affine memory accesses which we model today might contain products of parameters which we might combine into a new parameter to be able to create an affine expression that represents these memory accesses. Especially in the context of OpenCL, this approach loses information as memory accesses such as A[get_global_id(0) * N + get_global_id(1)] are assumed to be linear. We correctly recover their multi-dimensional structure by assuming that parameters that are the result of a function call at IR level likely are not parameters, but indeed induction variables. The resulting access is now A[get_global_id(0)][get_global_id(1)] for an array A[][N]. llvm-svn: 304075	2017-05-27 15:18:53 +00:00
Tobias Grosser	f5e7e60bc8	Allow side-effect free function calls in valid affine SCEVs Side-effect free function calls with only constant parameters can be easily re-generated and consequently do not prevent us from modeling a SCEV. This change allows array subscripts to reference function calls such as 'get_global_id()' as used in OpenCL. We use the function name plus the constant operands to name the parameter. This is possible as the function name is required and is not dropped in release builds the same way names of llvm::Values are dropped. We also provide more readable names for common OpenCL functions, to make it easy to understand the polyhedral model we generate. llvm-svn: 304074	2017-05-27 15:18:46 +00:00
Tobias Grosser	6ea64d8bd3	Update isl to isl-0.18-662-g17e172e This is a general maintenance update llvm-svn: 304069	2017-05-27 11:09:39 +00:00
Tobias Grosser	d5fcbef8ee	[Polly] Added the list of Instructions to output in ScopInfo pass Summary: This patch outputs the full list of instructions in BlockStmts. Reviewers: Meinersbur, grosser, bollu Subscribers: bollu, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D33163 llvm-svn: 304062	2017-05-27 04:40:18 +00:00
Tobias Grosser	9932086895	[ScopInfo] Translate mapToDimension to isl C++ [NFC] llvm-svn: 304007	2017-05-26 17:22:03 +00:00
Tobias Grosser	c8d13f50cc	[ScopInfo] Tighten compute out introduced in r303404 It seems we are still spending too much time on rare inputs, which continue to timeout the AOSP buildbot. Let's see if a further reduction is sufficient. llvm-svn: 303807	2017-05-24 21:24:04 +00:00
Philip Pfaffe	1a0128faaa	[Polly] Add handling of Top Level Regions Summary: My goal is to make the newly added `AllowWholeFunctions` options more usable/powerful. The changes to ScopBuilder.cpp are exclusively checks to prevent `Region.getExit()` from being dereferenced, since Top Level Regions (TLRs) don't have an exit block. In ScopDetection's `isValidCFG`, I removed a check that disallowed ReturnInstructions to have return values. This might of course have been intentional, so I would welcome your feedback on this and maybe a small explanation why return values are forbidden. Maybe it can be done but needs more changes elsewhere? The remaining changes in ScopDetection are simply to consider the AllowWholeFunctions option in more places, i.e. allow TLRs when it is set and once again avoid dereferencing `getExit()` if it doesn't exist. Finally, in ScopHelper.cpp I extended `polly::isErrorBlock` to handle regions without exit blocks as well: The original check was if a given BasicBlock dominates all predecessors of the exit block. Therefore I do the same for TLRs by regarding all BasicBlocks terminating with a ReturnInst as predecessors of a "virtual" function exit block. Patch by: Lukas Boehm Reviewers: philip.pfaffe, grosser, Meinersbur Reviewed By: grosser Subscribers: pollydev, llvm-commits, bollu Tags: #polly Differential Revision: https://reviews.llvm.org/D33411 llvm-svn: 303790	2017-05-24 18:39:39 +00:00
Michael Kruse	5f16986271	[DeLICM] Partial writes for PHIs. Enable the use for partial writes for PHI write accesses with a switch. This simply skips the test for whether a PHI write would be partial. The analog test for partial value writes also protects for partial reads which we do not support (yet). It is possible to test for partial reads separately such that we could skip the partial write check as well. In case this shows up to be useful, I can implement it as well. Differential Revision: https://reviews.llvm.org/D33487 llvm-svn: 303762	2017-05-24 15:23:06 +00:00
Michael Kruse	cb58bd6ccd	[JSONImporter] misses checks whether the data it imports makes sense. Without this patch, the JSONImporter did not verify if the data it loads were correct or not (Bug llvm.org/PR32543). I add some checks in the JSONImporter class and some test cases. Here are the checks (and test cases) I added: JSONImporter::importContext - The "context" key does not exist. - The context was not parsed successfully by ISL. - The isl_set has the wrong number of parameters. - The isl_set is not a parameter set. JSONImporter::importSchedule - The "statements" key does not exist. - The file does not contain the right number of statements. - The "schedule" key does not exist. - The schedule was not parsed successfully by ISL. JSONImporter::importAccesses - The "statements" key does not exist. - The file does not contain the right number of statements. - The "accesses" key does not exist. - The file does not contain the right number of memory accesses. - The "relation" key does not exist. - The memory access was not parsed successfully by ISL. JSONImporter::areArraysEqual - The "type" key does not exist. - The "sizes" key does not exist. - The "name" key does not exist. JSONImporter::importArrays /!\ Does not check whether there is a key named "arrays" because its absence is not considered an error. All checks are already in place or implemented in JSONImporter::areArraysEqual. Contributed-by: Nicolas Bonfante <nicolas.bonfante@insa-lyon.fr> Differential Revision: https://reviews.llvm.org/D32739 llvm-svn: 303759	2017-05-24 15:09:35 +00:00
Philip Pfaffe	483340bb83	[Polly][NewPM] Reenable ScopPassManager unittest llvm-svn: 303629	2017-05-23 11:28:50 +00:00
Philip Pfaffe	24a1bb2cf9	Post-commit fix of a comment llvm-svn: 303628	2017-05-23 11:25:05 +00:00
Philip Pfaffe	78ae52f0d6	[Polly][NewPM] Port CodeGen to the new PM Summary: To move CG to the new PM I outlined the various helper that were previously members of the CG class into free static functions. The CG class itself I moved into a header, which is required because we need to include it in `RegisterPasses` eventually. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: pollydev, llvm-commits, sanjoy Tags: #polly Differential Revision: https://reviews.llvm.org/D33423 llvm-svn: 303624	2017-05-23 10:18:12 +00:00
Philip Pfaffe	2b852e2e42	[Polly][NewPM] Port IslAst to the new ScopPassManager Summary: This patch ports IslAst to the new PM. The change is mostly straightforward. The only major modification required is making IslAst move-only, to correctly manage the isl resources it owns. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D33422 llvm-svn: 303622	2017-05-23 10:12:56 +00:00
Philip Pfaffe	3ef36fa222	[Polly][NewPM] Port DependenceInfo to the new ScopPassManager. Summary: This patch ports DependenceInfo to the new ScopPassManager. Printing is implemented as a separate printer pass. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33421 llvm-svn: 303621	2017-05-23 10:09:06 +00:00
Tobias Grosser	a32de1341e	[ScopInfo] Translate foldAccessRelation to isl C++ [NFC] llvm-svn: 303615	2017-05-23 07:22:56 +00:00
Tobias Grosser	53fc355e7d	[ScopInfo] Translate buildMemIntrinsicAccessRelation to isl C++ [NFC] llvm-svn: 303612	2017-05-23 07:07:09 +00:00
Tobias Grosser	1e2edaf3ea	[ScopInfo] Translate assumeNoOutOfBound to isl C++ [NFC] llvm-svn: 303611	2017-05-23 07:07:07 +00:00
Tobias Grosser	b1ed3d9749	[ScopInfo] Translate applyAndSetFAD to isl C++ llvm-svn: 303610	2017-05-23 07:07:05 +00:00
Tobias Grosser	2ade986c9e	[ScopInfo] Translate isReadOnly to isl C++ llvm-svn: 303608	2017-05-23 06:41:04 +00:00
Tobias Grosser	6d459c5d3d	[ScopInfo] Simplify domains early This speeds up scop modeling for scops with many redundant existentially quantified constraints. For the attached test case, this change reduces scop modeling time from minutes (hours?) to 0.15 seconds. This change resolves a compilation timeout on the AOSP build. Thanks Eli for reporting _and_ reducing the test case! Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 303600	2017-05-23 04:26:28 +00:00
Michael Kruse	1aad76c18f	[CodeGen] Add invalidation of the loop SCEVs after merge block generation. The SCEVs of loops surrounding the escape users of a merge block are forgotten, so that loop trip counts based on old values can be revoked. This fixes llvm.org/PR32536 Contributed-by: Baranidharan Mohan <mbdharan@gmail.com> Differential Revision: https://reviews.llvm.org/D33195 llvm-svn: 303561	2017-05-22 15:36:53 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then execute the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Tobias Grosser	7be8245a40	[ScopInfo] Translate updateDimensionality to isl C++ [NFC] llvm-svn: 303514	2017-05-21 20:38:33 +00:00
Tobias Grosser	a3f7546931	[isl++] add isl_constraint to C++ bindings [NFC] llvm-svn: 303512	2017-05-21 20:23:26 +00:00
Tobias Grosser	3137f2cb65	[ScopInfo] Translate wrapConstantDimensions to isl C++ [NFC] llvm-svn: 303511	2017-05-21 20:23:23 +00:00
Tobias Grosser	99ea1d0808	[ScopInfo] Translate addRangeBoundsToSet to isl C++ [NFC] llvm-svn: 303510	2017-05-21 20:23:20 +00:00
Tobias Grosser	7205f93a98	[ScheduleOptimizer] Move schedule construction to isl C++ [NFC] llvm-svn: 303508	2017-05-21 16:21:33 +00:00
Tobias Grosser	b5f61bdeeb	[Simplify] Move to isl C++ llvm-svn: 303507	2017-05-21 16:12:21 +00:00
Tobias Grosser	6151654c00	[isl++] Export (almost) all functions from isl This commit exports the majority of the isl functions to the isl C++ interface. The official isl C++ bindings still require discussions to define the set of functions that are officially supported. As a result, the officially exported functionality will be rather limited until these discussions conclude and a non-trivial set of isl functions is officially supported through the isl C++ bindings. Starting from this commit we ship with Polly an extended version of the official isl C++ bindings to ensure sufficient functionality is available such that LLVM developers can make efficient use of isl through C++. The practical experience Polly gathers with its bindings will then be used to gradually upstream patches to isl to extend the official bindings. llvm-svn: 303506	2017-05-21 16:00:32 +00:00
Tobias Grosser	443f6814a1	[isl++] Rebase isl C++ bindings on top of 29aee98ce This reduces the diff to the official isl C++ bindings and solves a correctness issue with isl::booleans, where isl_bool_error results were accidentally converted to isl::boolean::true. llvm-svn: 303505	2017-05-21 15:59:15 +00:00
Tobias Grosser	3320485961	[isl++] Move isl raw_ostream printers into separate header Instead of relying on these functions to be part of the isl C++ bindings, we just define this functionality independently. This allows us to use isl C++ bindings that do not contain LLVM specific functionality. llvm-svn: 303503	2017-05-21 13:16:05 +00:00
Siddharth Bhat	b7f68b8c9e	[Fortran Support] Materialize outermost dimension for Fortran array. - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429	2017-05-19 15:07:45 +00:00
Tobias Grosser	d8945baa0a	[ScopDetection] Allow detection of full functions This is useful when only analyzing functions. llvm-svn: 303420	2017-05-19 12:13:02 +00:00
Tobias Grosser	977158488e	[ScopInfo] Fix typo in documentation llvm-svn: 303405	2017-05-19 04:01:52 +00:00
Tobias Grosser	45e9fd1810	[ScopInfo] Gracefully handle long compile times The following test case tried to compute the lexicographic minimum of the following set during alias analysis, which caused very long compile time: [p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] : (517p_3 >= 70944 - 298p_2 and 256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or (409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <= -57120 + 298p_2 + 409p_4) or (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >= 17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or (104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or (409p_4 <= 57119 - 298p_2 and 256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or (517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <= -70689 + 298p_2 + 517p_3) or (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and 256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5 >and 256i0 <= -57120 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 149p_5 <= -17329 >+ 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= >17455 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 + >50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328 >- 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and >256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or >(p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 + >298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) } We now guard the potentially expensive functions in Polly's scop analysis to gracefully bail out in case of overly long compilation times. llvm-svn: 303404	2017-05-19 03:45:00 +00:00
Michael Kruse	960c0d0b04	[ScopInfo] Fix r302231 to use logical or (||). NFC. In r302231 we mistakenly used bitwise or (|) instead of logical or (||). This patch fixes that. Contributed-by: Sameer AbuAsal <sabuasal@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33337 llvm-svn: 303386	2017-05-18 21:55:36 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call to Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Siddharth Bhat	06e3c74d83	[Fortran Support] Change "global" pattern match to work for params Summary: - Rename global / local naming convention that did not make much sense to Visible / Invisible, where the visible refers to whether the ALLOCATE call to the Fortran array is present in the current module or not. - This match now works on both cross fortran module globals and on parameters to functions since neither of them are necessarily allocated at the point of their usage. - Add testcase that matches against both a load and a store against function parameters. Differential Revision: https://reviews.llvm.org/D33190 llvm-svn: 303356	2017-05-18 16:47:13 +00:00
Tobias Grosser	ff3f38b2c5	Adjust formatting llvm-svn: 303065	2017-05-15 14:12:27 +00:00
Philip Pfaffe	35bdcaf9e9	[Polly][NewPM][WIP] Add a ScopPassManager This patch adds both a ScopAnalysisManager and a ScopPassManager. The ScopAnalysisManager is itself a Function-Analysis, and manages analyses on Scops. The ScopPassManager takes care of building Scop pass pipelines. This patch is marked WIP because I've left two FIXMEs which I need to think about some more. Both of these deal with invalidation: Deferred invalidation is currently not implemented. Deferred invalidation deals with analyses which cache references to other analysis results. If these results are invalidated, invalidation needs to be propagated into the caching analyses. The ScopPassManager as implemented assumes that ScopPasses do not affect other Scops in any way. There has been some discussion about this on other patch threads, however it makes sense to reiterate this for this specific patch. I'm uploading this patch even though it's incomplete to encourage discussion and give you an impression of how this is going to work. Differential Revision: https://reviews.llvm.org/D33192 llvm-svn: 303062	2017-05-15 13:43:01 +00:00
Philip Pfaffe	838e0884ef	[Polly][NewPM] Port ScopInfo to the new PassManager llvm-svn: 303056	2017-05-15 12:55:14 +00:00
Siddharth Bhat	0fe7231a2f	[Fortran Support] Add pattern match for Fortran Arrays that are parameters. - This breaks the previous assumption that Fortran Arrays are `GlobalValue`. - The names of functions were getting unwieldy. So, I renamed the Fortran related functions. Differential Revision: https://reviews.llvm.org/D33075 llvm-svn: 303040	2017-05-15 08:41:30 +00:00
Siddharth Bhat	9746f817ea	[Simplify] Fix r302986 that introduced non-inferrable templates. - auto + decltype + template use was not inferrable in `Transform/Simplify.cpp accessesInOrder`. - changed code to explicitly construct required vector instead of using higher order iterator helpers. - Failing compiler spec: Apple LLVM version 7.3.0 (clang-703.0.31) Target: x86_64-apple-darwin15.6.0 llvm-svn: 303039	2017-05-15 08:18:51 +00:00
Tobias Grosser	497fdd7dff	[Simplify] Remove some leftover dead code llvm-svn: 303007	2017-05-14 09:20:56 +00:00
Tobias Grosser	b693f42b71	[Polly] Fix code generation of llvm.expect intrinsic At the time of code generation, an instruction with an llvm intrinsic is ignored in copyBB. However, if the value of the instruction is used later in the program, the value needs to be synthesized. However, this is causing some issues with the instructions being generated in a hoisted basic block. Removing llvm.expect from the list of ignored intrinsics fixes this bug. This resolves http://llvm.org/PR32324. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Tags: #polly Differential Revision: https://reviews.llvm.org/D32992 llvm-svn: 303006	2017-05-14 09:09:54 +00:00
Michael Kruse	fa7be88378	[Simplify] Remove identical write removal. NFC. Removal of overwritten writes currently encompasses all the cases of the identical write removal. There is an observable behavioral change in that the last, instead of the first, MemoryAccess is kept. This should not affect the generated code, however. Differential Revision: https://reviews.llvm.org/D33143 llvm-svn: 302987	2017-05-13 12:20:57 +00:00
Michael Kruse	f263610b82	[Simplify] Remove writes that are overwritten. Remove memory writes that are overwritten by later writes. This works for StoreInsts (store double 21.0, double* %A; store double 42.0, double* %A), scalar writes at the end of a statement, and mixes of these. Multiple writes can be the result of DeLICM, which might map multiple writes to the same location when it knows that these do not conflict (for instance because they write the same value). Such writes interfere with pattern-matched optimization such as gemm and may not get removed by other LLVM passes after code generation. Differential Revision: https://reviews.llvm.org/D33142 llvm-svn: 302986	2017-05-13 11:49:34 +00:00
Michael Kruse	aeb4864090	[Simplify] Reset all stats between runs. llvm-svn: 302926	2017-05-12 17:23:07 +00:00
Philip Pfaffe	5cc87e3ab3	[Polly][NewPM] Port ScopDetection to the new PassManager Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: pollydev, sanjoy, nemanjai, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31459 llvm-svn: 302902	2017-05-12 14:37:29 +00:00
Hongbin Zheng	5b263d4ce1	[Polly] Remove unused header llvm-svn: 302868	2017-05-12 02:21:50 +00:00
Hongbin Zheng	4fe342cb75	[Polly] Generate more 'canonical' induction variable Today Polly generates induction variables in this way: polly.indvar = phi 0, polly.indvar.next ... polly.indvar.next = polly.indvar + stride polly.loop_cond = predicate polly.indvar, (UB - stride) Instead of: polly.indvar = phi 0, polly.indvar.next ... polly.indvar.next = polly.indvar + stride polly.loop_cond = predicate polly.indvar.next, UB The way Polly generates induction variables causes some problems in the indvar simplify pass. This patch makes Polly generate the latter form, by assuming the induction variable never overflows. Differential Revision: https://reviews.llvm.org/D33089 llvm-svn: 302866	2017-05-12 02:17:15 +00:00
Michael Kruse	d644ec7647	[DeLICM] Use input access heuristic for mapped PHI WRITEs. As with the scalar operand of the initial StoreInst, also use input accesses when searching for new opportunities after mapping a PHI write. The same rationale applies here: After LICM has been applied, the promoted value will either be an instruction in the same statement (in which case we fall back to try every scalar access of the statement), or in another statement such that there will be such an input access. In the latter case other scalars cannot have originated from the same register promotion, at least not by LICM. This mostly helps to decrease compilation time and makes debugging easier by not pursuing unpromising routes. In some circumstances, it may change the compiler's output. llvm-svn: 302839	2017-05-11 22:56:59 +00:00
Michael Kruse	4c27643398	[DeLICM] Lookup input accesses. Previous to this patch, we used VirtualUse to determine the input access of an llvm::Value in a statement. The input access is the READ MemoryAccess that makes a value available in that statement, which can either be a READ of a MemoryKind::Value or the MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input access to heuristically find a candidate to map without searching all possible values. This might modify the behaviour in that previously PHI accesses were not considered input accesses. This was unintentionally lost when "VirtualUse" was extracted from the "Known Knowledge" patch. llvm-svn: 302838	2017-05-11 22:56:46 +00:00
Michael Kruse	bfaa1857b3	[VirtualInstruction] Do a lookup instead of a linear search. NFC. llvm-svn: 302837	2017-05-11 22:56:27 +00:00
Michael Kruse	e60eca7316	[ScopInfo] Keep scalar access dictionaries up-to-date. NFC. When removing a MemoryAccess, also remove it from maps pointing to it. This was already done for InstructionToAccess, but not yet for ValueReads, ValueWrites and PHIWrites as those were only used during the ScopBuilder phase. Keeping them updated allows us to use them later as well. llvm-svn: 302836	2017-05-11 22:56:12 +00:00
Michael Kruse	07e315e780	[Simplify] Remove identical scalar writes. After DeLICM, it is possible to have two writes of the same value to the same location in the same statement when it determined that those writes do not conflict (write the same value). Teach -polly-simplify to remove one of the writes. It interferes with the pattern matching of matrix-multiplication kernels and also seems to not be optimized away by LLVM. The algorithm is simple, has O(n^2) behaviour (n = max number of MemoryAccesses in a statement) and only matches the most obvious cases, but seems to be enough to pattern-match Boost ublas gemm. Not handled cases include: - StoreInst instructions (a.k.a. explicit writes), since the value might be loaded or overwritten between the two stores. - PHINode, especially LCSSA, when the PHI value matches with another's. - Partial writes (in preparation) llvm-svn: 302805	2017-05-11 15:07:38 +00:00
Michael Kruse	a0987b83d5	[Simplify] Mark variables as used. NFC. Mark one more variable as used that is needed in assertions. llvm-svn: 302726	2017-05-10 20:45:10 +00:00
Michael Kruse	4aac59cee1	[Simplify] Mark variables as used. NFC. Mark variables as used that are needed in assertions. llvm-svn: 302725	2017-05-10 20:42:02 +00:00
Michael Kruse	f41f274bf8	[DeLICM] Avoid compiler warning. NFC. gcc 5.4 warns about using a C-style cast to cast away a const. Use a const_cast instead. llvm-svn: 302715	2017-05-10 19:58:52 +00:00
Michael Kruse	f69a7c306b	[DeLICM] Always normalize domain. NFC. Some isl functions can simplify their __isl_keep arguments. The argument object after the call uses different constraints to represent the same set. Different constraints can result in different outputs when printed to a string. In assert builds, additional isl functions are called (in assert()); as mentioned, these can change the internal representation of their read-only arguments such that printed strings are different in debug and non-debug builds. What happened here is that a call to isl_set_is_equal inside an assert in getScatterFor normalizes one of its arguments such that one redundant constraint is removed. The redundant constraint therefore does not appear in the string representing the domain, which FileCheck notices as a regression test failure compared to a build with assertions disabled. This fix removes the redundant constraints from the domain from the start such that the redundant constraint is removed in assert and non-assert builds. Isl adds a flag to such sets such that the removal of redundancies is not done multiple times (here: by isl_set_is_equal). Thanks to Tobias Grosser for reporting and hinting to the cause. llvm-svn: 302711	2017-05-10 19:50:45 +00:00
Siddharth Bhat	c47f039efd	[Fix] [Fortran Support] Fix variable name & make testcase activate on release There was: #ifdef NDEBUG This should be: #ifndef NDEBUG Also, the variable name was incorrect. Fixed the variable name. llvm-svn: 302696	2017-05-10 17:27:48 +00:00
Siddharth Bhat	f2dbba8183	[Fortran Support] Detect Fortran arrays & metadata from dragonegg output Add the ability to tag certain memory accesses as those belonging to Fortran arrays. We do this by pattern matching against known patterns of Dragonegg's LLVM IR output from Fortran code. Fortran arrays have metadata stored with them in a struct. This struct is called the "Fortran array descriptor", and a reference to this is stored in each MemoryAccess. Differential Revision: https://reviews.llvm.org/D32639 llvm-svn: 302653	2017-05-10 13:11:20 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Tobias Grosser	1a2e0e6415	Fix formatting in Polly llvm-svn: 302620	2017-05-10 04:53:59 +00:00
Chandler Carruth	d742e5efa8	Update Polly for LLVM API change r302571 that removed varargs functions with a nullptr sentinel in favor of nicely typed variadic templates. llvm-svn: 302618	2017-05-10 02:39:35 +00:00
Siddharth Bhat	a90be207c6	[Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases). Reviewers: grosser, Meinersbur, bollu Reviewed By: bollu Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D32961 llvm-svn: 302515	2017-05-09 10:45:52 +00:00
Siddharth Bhat	17f01968f1	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302379	2017-05-07 21:03:46 +00:00
Michael Kruse	5ae08c0ebb	[DeLICM] Known knowledge. Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict if the known content is the same. The content information is computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339	2017-05-06 14:03:58 +00:00
Michael Kruse	2a8f6f843f	[CMake] Introduce POLLY_BUNDLED_JSONCPP. Allow using a system-installed jsoncpp library instead of the bundled one with the setting POLLY_BUNDLED_JSONCPP=OFF. This fixes llvm.org/PR32929 Differential Revision: https://reviews.llvm.org/D32922 llvm-svn: 302336	2017-05-06 13:42:15 +00:00
Michael Kruse	391a2ac09b	[ScopBuilder] Move Scop::init to ScopBuilder. NFC. Scop::init is used only during SCoP construction. Therefore ScopBuilder seems the more appropriate place for it. We integrate it into its only caller ScopBuilder::buildScop where some other construction steps already took place. Differential Revision: https://reviews.llvm.org/D32908 llvm-svn: 302276	2017-05-05 20:09:08 +00:00
Michael Kruse	f1052ceb5e	[ScopBuilder] Do not verify unfeasible SCoPs. SCoPs with unfeasible runtime context are thrown away and therefore do not need their uses verified. The added test case requires a complexity limit to be exceeded. Normally, error statements are removed from the SCoP and for that reason are skipped during the verification. If there is an unfeasible runtime context (here: because of the complexity limit being reached), the removal of error statements and other SCoP construction steps are skipped to not waste time. Error statements are not modeled in SCoPs and therefore have no requirements on whether the scalars used in them are available. llvm-svn: 302234	2017-05-05 13:38:35 +00:00
Tobias Grosser	d5727c5011	Fix handling of signWrappedSets in access relations Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip manually bounding the access relation in case the parameter of the load instruction is already a wrapped set. Later on we assume that the lower bound on the set is always smaller or equal to the upper bound on the set. Bug 32715 manages to construct a sign wrapped set, in which case the assertion does not necessarily hold. Fix this by handling a sign wrapped set similar to a normal wrapped set, that is skipping the computation. Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Reviewers: grosser Subscribers: pollydev, llvm-commits Tags: #Polly Differential Revision: https://reviews.llvm.org/D32893 llvm-svn: 302231	2017-05-05 13:20:47 +00:00
Siddharth Bhat	c1267b9baa	Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen" This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933. Patches should have been submitted in the order of: 1. D32852 2. D32854 3. D32431 I mistakenly pushed D32431(3) first. Reverting to push in the correct order. llvm-svn: 302217	2017-05-05 09:02:08 +00:00
Siddharth Bhat	51904ae35a	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302215	2017-05-05 07:54:49 +00:00
Michael Kruse	704c03e03b	[ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH. It was forgotten in r302157. llvm-svn: 302163	2017-05-04 15:55:54 +00:00
Michael Kruse	eedae7630a	Introduce VirtualUse. NFC. If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157	2017-05-04 15:22:57 +00:00
Tobias Grosser	3f25a7e8ee	[ScopDetection] Check for already known required-invariant loads [NFC] For certain test cases we spent over 50% of the scop detection time in checking if a load is likely invariant. We can avoid most of these checks by testing early on if a load is expected to be invariant. Doing this reduces scop-detection time on a large benchmark from 52 seconds to just 25 seconds. No functional change is expected. llvm-svn: 302134	2017-05-04 10:16:20 +00:00
Tobias Grosser	e2ccc3fb33	[ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters LLVM-IR names are commonly available in debug builds, but often not in release builds. Hence, using LLVM-IR names to identify statements or memory reference results makes the behavior of Polly depend on the compile mode. This is undesirable. Hence, we now just number the statements instead of using LLVM-IR names to identify them (this issue has previously been brought up by Zino Benaissa). However, as LLVM-IR names help in making test cases more readable, we add an option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by default set in the polly tests to make test cases more readable. This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the following test case provided by Eli Friedman <efriedma@codeaurora.org> (already used in one of the previous commits): struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } For a larger benchmark I have on-hand (10000 loops), this reduces the time for running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%. The reason for this large speedup is that our previous use of printAsOperand had a quadratic cost, as for each printed and unnamed operand the full function was scanned to find the instruction number that identifies the operand. We do not need to adjust the way memory reference ids are constructed, as they do not use LLVM values. Reviewed by: efriedma Tags: #polly Differential Revision: https://reviews.llvm.org/D32789 llvm-svn: 302072	2017-05-03 20:08:52 +00:00
Tobias Grosser	72684bbaf5	[ScopInfo] Remove code not needed anymore after r302004 llvm-svn: 302005	2017-05-03 08:02:32 +00:00
Tobias Grosser	8133128c17	[ScopInfo] Do not add array name into memory reference ids Before this change a memory reference identifier had the form: <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11 After this change, we use the format: <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0 The name of the array that is accessed through a memory reference is not necessary to uniquely identify a memory reference, but was only added to provide additional information for debugging. We drop this information now for the following two reasons: 1) This shortens the names and consequently improves readability 2) This removes a second location where we decide on the name of a scop array, leaving us only with the location where the actual scop array is created. Having after 2) only a single location to name scop arrays will allow us to change the naming convention of scop arrays more easily, which we will do in a future commit to reduce compilation time. llvm-svn: 302004	2017-05-03 07:57:35 +00:00
Siddharth Bhat	6c3d19ba45	[NFC] [IslAST] fix typo: "int the" -> "in the" llvm-svn: 301925	2017-05-02 14:54:49 +00:00
Michael Kruse	ecbd57e98a	[CMake] Move PollyCore to Polly project folder. This keeps the artifacts consistently structured in the "Polly" folder of Visual Studio solutions. llvm-svn: 301779	2017-04-30 21:07:05 +00:00
Hongbin Zheng	e9a9932712	[Polly] Make PollyCore depends on intrinsics_gen llvm-svn: 301734	2017-04-29 03:12:17 +00:00
Tobias Grosser	f13722177b	[Codegen] Disable Polly's codegen verification by default As has been reported in the previous commit, codegen verification can result in quadratic compile time increases for large functions with many scops. This is certainly not something we would like to have in the Polly default configuration. Hence, we disable codegen verification by default -- also to see if this resolves some of the compilation timeouts we currently see on the AOSP buildbots. We still leave this feature in Polly as it has shown _very_ useful for debugging. In fact, we may want to have a discussion if we can bring this feature back in a way that does not impact compilation time so much. Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and for providing the test case in the previous commit (where I forgot to acknowledge him). llvm-svn: 301670	2017-04-28 19:15:28 +00:00
Tobias Grosser	d439911f73	[CodeGen] Skip verify if -polly-codegen-verify is set to false Before this change, we always tried to verify the function and printed verification errors, but just did not abort in case -polly-codegen-verify=false was set and verification failed. As verification can become very costly -- for large functions with many scops we may verify the very same function very often -- this can affect compile time very negatively. Hence, we respect the -polly-codegen-verify flag with this check, ensuring that no verification is run if -polly-codegen-verify=false. This reduces code generation time from 26 seconds to 4 seconds on the test case below with -polly-codegen-verify=false: struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } llvm-svn: 301669	2017-04-28 19:08:20 +00:00
Siddharth Bhat	abed49699b	[Polly] [PPCGCodeGeneration] Add managed memory support to GPU code generation. This needs changes to GPURuntime to expose synchronization between host and device. 1. Needs better function naming, I want a better name than "getOrCreateManagedDeviceArray" 2. DeviceAllocations is used by both the managed memory and the non-managed memory path. This exploits the fact that the two code paths are never run together. I'm not sure if this is the best design decision Reviewed by: PhilippSchaad Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301640	2017-04-28 11:16:30 +00:00
Tobias Grosser	287942ae82	Update to isl-0.18-592-gb50ad59 This is just a general maintenance update. llvm-svn: 301624	2017-04-28 06:11:17 +00:00
Tobias Grosser	c96c1d8c87	[ScopInfo] Consider only write-free dereferencable loads as invariant When we introduced in r297375 support for hoisting loads that are known to be dereferencable without any conditional guard, we forgot to keep the check to verify that no other write into the very same location exists. This change ensures now that dereferencable loads are allowed to access everything, but can only be hoisted in case no conflicting write exists. This resolves llvm.org/PR32778 Reported-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 301582	2017-04-27 20:08:16 +00:00
Michael Kruse	792a6fcc57	[CMake] Use object library to build the two flavours of Polly. Polly comes in two library flavors: One loadable module to use the LLVM framework -load mechanism, and another one that host applications can link to. These have very different requirements for Polly's own dependencies. The loadable module assumes that all its LLVM dependencies are already available in the address space of the host application, and is not allowed to bring in its own copy of any LLVM library (including the NVPTX backend in case of Polly-ACC). The non-module library is intended to be linked to using target_link_libraries. CMake would then resolve all of its dependencies, including NVPTX and ensure that only a single instance of each library will be used. Differential Revision: https://reviews.llvm.org/D32442 llvm-svn: 301558	2017-04-27 16:13:03 +00:00
Hongbin Zheng	0f8f177682	[Polly] Do not introduce address space cast Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519	2017-04-27 06:42:14 +00:00
Michael Kruse	3e519b949b	[DeLICM] Use Known information when comparing Occupied and Written. Do not conflict if a write writes the same value as already known. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32026 llvm-svn: 301460	2017-04-26 20:35:07 +00:00
Tobias Grosser	1c3eebac08	Update to isl-0.18-423-g30331fe This is just a general maintenance update. llvm-svn: 301433	2017-04-26 17:08:02 +00:00
Michael Kruse	cd2be66bf0	[DeLICM] Use Known information when comparing Existing.Occupied and Proposed.Occupied. Do not conflict if the value of Existing and Proposed are the same. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32025 llvm-svn: 301301	2017-04-25 10:57:32 +00:00
Siddharth Bhat	d277feda91	[PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now, specifically requests global address space to be annotated. This is necessary, if the IR should be run through NVPTX to generate OpenCL compatible PTX. The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl). Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends. Contributed-by: Philipp Schaad Reviewers: Meinersbur, grosser, bollu Reviewed By: grosser, bollu Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301299	2017-04-25 08:08:29 +00:00
Siddharth Bhat	729377f063	[Polly] [DependenceInfo] change WAR generation, Read will not block Read Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute Read. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266	2017-04-24 22:23:12 +00:00
Tobias Grosser	9b34a08b19	[isl C++ bindings] Add explicit const casts for foreach bindings This avoids a compiler warning about lost 'const' attributes. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 301108	2017-04-23 07:54:12 +00:00
Michael Kruse	8431e996d3	[DeLICM] Use Known information when comparing Existing.Written and Proposed.Written. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32027 llvm-svn: 300874	2017-04-20 19:16:39 +00:00
Tobias Grosser	1f8b84094f	Update isl bindings to latest version (+ Polly extensions) After the isl C++ binding generator is now close to being upstreamed to isl, we synchronize the latest changes to Polly. These are mostly formatting changes plus a small interface change for the foreach callback function and some naming changes in isl::boolean. llvm-svn: 300398	2017-04-15 08:15:54 +00:00
Tobias Grosser	75aa1a9a49	Use isl C++ foreach implementation This commit switches Polly over to the isl::obj::foreach_* implementation, which is part of the new isl bindings and follows the foreach pattern established in Polly by Michael Kruse. The original isl C function: isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset, isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user); which required the user to define a static callback function to which all interesting parameters are passed via a 'void *' user-pointer, is on the C++ side available as a function that takes a std::function<>, which can carry any additional arguments without the need for a user pointer: stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const; The following code illustrates the use of the new C++ interface: auto Lambda = [=, &Result](isl::set Set) -> isl::stat { auto Shifted = shiftDimension(Set, Pos, Amount); Result = Result.add(Shifted); return isl::stat::ok; }; UnionSet.foreach_set(Lambda); Polly had some specialized foreach functions which did not require the lambdas to return a status flag. We remove these functions in this commit to move Polly completely over to the new isl interface. We may in the future discuss if functors without return values can be supported easily. Another extension proposed by Michael Kruse is the use of C++ iterators to allow the use of normal for loops to iterate over these sets. Such an extension would allow us to further simplify the code. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D30620 llvm-svn: 300323	2017-04-14 13:39:40 +00:00
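The contrast between the two callback styles described in the commit above can be sketched without isl. The block below is a minimal illustration, not isl or Polly code: `ToyUnionSet`, `Stat`, `toyForeachC`, `toyForeach`, and the two `sumWith...` helpers are hypothetical names, and plain integers stand in for sets.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins for isl types; not part of isl or Polly.
enum class Stat { ok, error };
struct ToyUnionSet {
  std::vector<int> Sets; // each int plays the role of one set
};

// C style: all state travels through an untyped 'void *' user pointer.
Stat toyForeachC(const ToyUnionSet &USet, Stat (*Fn)(int Set, void *User),
                 void *User) {
  for (int Set : USet.Sets)
    if (Fn(Set, User) == Stat::error)
      return Stat::error;
  return Stat::ok;
}

// C++ style: the callable carries its own state, so no user pointer is needed.
Stat toyForeach(const ToyUnionSet &USet, const std::function<Stat(int)> &Fn) {
  for (int Set : USet.Sets)
    if (Fn(Set) == Stat::error)
      return Stat::error;
  return Stat::ok;
}

// Static callback required by the C-style interface.
static Stat addToSum(int Set, void *User) {
  *static_cast<int *>(User) += Set;
  return Stat::ok;
}

int sumWithUserPointer(const ToyUnionSet &USet) {
  int Sum = 0;
  toyForeachC(USet, addToSum, &Sum);
  return Sum;
}

int sumWithLambda(const ToyUnionSet &USet, int Shift) {
  int Sum = 0;
  // Shift is captured by value and Sum by reference, as in the commit's example.
  toyForeach(USet, [=, &Sum](int Set) -> Stat {
    Sum += Set + Shift;
    return Stat::ok;
  });
  return Sum;
}
```

In this sketch the lambda version needs neither a separate static function nor a cast from `void *`, which is the simplification the commit message points at.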
Michael Kruse	72f3922534	[DeLICM] Export Known and Written to DeLICMTests. NFC. This will allow unittesting of new functionality based on Known and Written. llvm-svn: 300211	2017-04-13 16:32:39 +00:00
Michael Kruse	a2acc11949	[DeLICM] Add Knowledge::Known. NFC. This field will later contain a ValInst that is known to be stored in an occupied array element. llvm-svn: 300210	2017-04-13 16:32:31 +00:00
Michael Kruse	fa7c8cdfc6	[DeLICM] Make Knowledge::Written an isl::union_map. NFC. The map will later point to a ValInst that is written. llvm-svn: 300208	2017-04-13 16:32:25 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Tobias Grosser	67726b3260	Adjust to recent change in constructor definition of AllocaInst llvm-svn: 299913	2017-04-11 04:23:38 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Roman Gareev	9d4d91ca6a	[FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern Use new values of the dimensions during their permutation. llvm-svn: 299663	2017-04-06 17:25:08 +00:00
Roman Gareev	e0d466342b	Restore the initial ordering of dimensions before applying the pattern matching Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce the miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662	2017-04-06 17:09:54 +00:00
Siddharth Bhat	5eeb1dd42e	[Polly] [ScheduleOptimizer] Prevent incorrect tile size computation Because Polly exposes parameters that directly influence tile size calculations, one can set up situations such as a divide-by-zero. Check against a possible divide-by-zero in getMacroKernelParams and return early. Also assert at the end of getMacroKernelParams that the block sizes computed for matrices are positive (>= 1). Tags: #polly Differential Revision: https://reviews.llvm.org/D31708 llvm-svn: 299633	2017-04-06 08:20:22 +00:00
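The guard pattern described in the commit above (return early before a possible divide-by-zero, then assert that the results are positive) might look roughly like the following sketch. The formula is a made-up toy and is not Polly's getMacroKernelParams; `ToyBlockSizes`, `toyBlockSizes`, and all parameter names are assumptions for illustration only.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical result type; {1, 1, 1} is used as a neutral fallback.
struct ToyBlockSizes {
  long Mc, Nc, Kc;
};

// Toy computation driven by user-controlled parameters (not Polly's formula).
ToyBlockSizes toyBlockSizes(long CacheElems, long Ways, long Mr, long Nr) {
  // Reject inputs that cannot yield a meaningful tiling.
  if (CacheElems <= 0 || Ways <= 0 || Mr <= 0 || Nr <= 0)
    return {1, 1, 1};

  // Integer division may truncate this intermediate value to zero.
  long WaysForA = Ways / (1 + Nr / Mr);
  if (WaysForA == 0)
    return {1, 1, 1}; // return early instead of dividing by zero below

  long Kc = std::max(1L, (WaysForA * CacheElems) / (Mr * Ways));
  long Mc = std::max(1L, CacheElems / WaysForA); // safe: WaysForA != 0
  long Nc = std::max(1L, CacheElems / Kc);       // safe: Kc >= 1

  // Mirror the commit's final check: every block size must be positive.
  assert(Mc >= 1 && Nc >= 1 && Kc >= 1);
  return {Mc, Nc, Kc};
}
```

For example, `toyBlockSizes(1024, 2, 2, 8)` takes the early return because the intermediate value truncates to zero, while `toyBlockSizes(1024, 8, 4, 4)` runs through the full computation.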
Tobias Grosser	0d622a4bf9	Update to isl-0.18-417-gb9e7334 This is a regular maintenance update. llvm-svn: 299617	2017-04-06 03:41:47 +00:00
Michael Kruse	895f5d8080	Remove llvm.lifetime.start/end in original region. The current StackColoring algorithm does not correctly handle the situation when some, but not all paths from a BB to the entry node cross a llvm.lifetime.start. According to an interpretation of the language reference at http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic this might be correct, but it would cost too much effort to handle in StackColoring. To be on the safe side, remove all lifetime markers even in the original code version (they have never been copied to the optimized version) to ensure that no path to the entry block will cross a llvm.lifetime.start. The same principle applies to paths to a function return and the llvm.lifetime.end marker, so we remove them as well. This fixes llvm.org/PR32251. Also see the discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html llvm-svn: 299585	2017-04-05 20:09:59 +00:00
Siddharth Bhat	bcbfdade41	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Schedule)` treats any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-source. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block earlier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. 
= Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 \| -> A = add (store0) \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = tmp + 42 \| A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = A + 42 \| A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429	2017-04-04 13:08:23 +00:00
Philip Pfaffe	447f175eb5	Fix formatting in LoopGenerators llvm-svn: 299424	2017-04-04 10:22:17 +00:00
Philip Pfaffe	2d950f36ee	[Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423	2017-04-04 10:01:53 +00:00
Tobias Grosser	637be04b77	[PerfMonitor] Use Intrinsics::getDeclaration Instead of creating the declaration ourselves, we obtain it directly from the LLVM intrinsic definitions. This addresses a post-review comment for r299359. Suggested-by: Hongbin Zheng <etherzhhb@gmail.com> llvm-svn: 299360	2017-04-03 15:23:08 +00:00
Tobias Grosser	65371af2e1	[CodeGen] Add Performance Monitor Add support for -polly-codegen-perf-monitoring. When performance monitoring is enabled, we emit performance monitoring code during code generation that prints after program exit statistics about the total number of cycles executed as well as the number of cycles spent in scops. This gives an estimate on how useful polyhedral optimizations might be for a given program. Example output: Polly runtime information ------------------------- Total: 783110081637 Scops: 663718949365 In the future, we might also add functionality to measure how much time is spent in optimized scops and how many cycles are spent in the fallback code. Reviewers: bollu,sebpop Tags: #polly Differential Revision: https://reviews.llvm.org/D31599 llvm-svn: 299359	2017-04-03 14:55:37 +00:00
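The bookkeeping behind such a monitor can be sketched as two counters: cycles since program start and cycles accumulated inside scops. The class below is a hypothetical model, not the code Polly emits. `ToyPerfMonitor` and its methods are invented names, timestamps are passed in explicitly to keep the example deterministic, and the line breaks in the report are an assumption based on the example output quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical model of the bookkeeping. Real instrumentation would read a
// hardware cycle counter instead of receiving timestamps as arguments.
class ToyPerfMonitor {
  uint64_t ProgramStart;
  uint64_t ScopStart = 0;
  uint64_t CyclesInScops = 0;
  bool InScop = false;

public:
  explicit ToyPerfMonitor(uint64_t Now) : ProgramStart(Now) {}

  void enterScop(uint64_t Now) {
    assert(!InScop && "scops are not nested in this sketch");
    InScop = true;
    ScopStart = Now;
  }

  void exitScop(uint64_t Now) {
    assert(InScop && Now >= ScopStart);
    InScop = false;
    CyclesInScops += Now - ScopStart;
  }

  uint64_t total(uint64_t Now) const { return Now - ProgramStart; }
  uint64_t scops() const { return CyclesInScops; }

  // Formats the two counters in the style of the example output.
  std::string report(uint64_t Now) const {
    std::ostringstream OS;
    OS << "Polly runtime information\n"
       << "-------------------------\n"
       << "Total: " << total(Now) << "\n"
       << "Scops: " << scops() << "\n";
    return OS.str();
  }
};
```

The ratio of the two numbers gives the estimate mentioned in the commit message: the larger the share of cycles spent in scops, the more a program could benefit from polyhedral optimization.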
Michael Kruse	6e7854a560	[ScopInfo] Fix typos in option description. llvm-svn: 299356	2017-04-03 12:03:38 +00:00
Tobias Grosser	696a1ee99d	[PollyIRBuilder] Bound size of alias metadata No-alias metadata grows quadratically in the size of arrays involved, which can become very costly for large programs. This commit bounds the number of arrays for which we construct no-alias information to ten. This is conservatively correct, as we just provide less information to LLVM, and it speeds up the compile time of one of my internal test cases from 'does-not-terminate' to 'finishes-in-less-than-a-minute'. In the future we might try to be more clever here, but this change should provide a good baseline. llvm-svn: 299352	2017-04-03 07:42:50 +00:00
Tobias Grosser	af940ae280	Update to isl-0.18-410-gc253447 This is a regular maintenance update to ensure latest isl changes are tested in our buildbots. llvm-svn: 299350	2017-04-03 06:46:16 +00:00
Huihui Zhang	d6d6a3f2ee	revert test commit r299024 llvm-svn: 299026	2017-03-29 20:23:56 +00:00
Huihui Zhang	9d19e9d232	test commit, add blank line llvm-svn: 299024	2017-03-29 20:10:45 +00:00
Michael Kruse	c3e9c1442d	[ScopInfo] Introduce ScopStmt::contains(BB*). NFC. Provide a common way of testing whether a statement contains something, for both region and block statements. First user is RegionGenerator::addOperandToPHI. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298617	2017-03-23 16:12:21 +00:00
Tobias Grosser	1f7e7d3d93	Update to isl-0.18-402-ga30c537 This is a regular maintenance update. llvm-svn: 298595	2017-03-23 13:38:24 +00:00
Michael Kruse	9e4e7b467f	[DeLICM] Add const qualifiers. NFC. llvm-svn: 298546	2017-03-22 20:09:58 +00:00
Michael Kruse	174f483990	[Support] Add functions to ISLTools. Add shiftDim and convertZoneToTimepoints overloads for isl maps. Add distributeDomain, liftDomains and applyDomainRange functions. These are going to be used in https://reviews.llvm.org/D31247 (Add known array contents to Knowledge) llvm-svn: 298543	2017-03-22 19:31:06 +00:00
Michael Kruse	d07d155ebb	[DeLICM] Remove overloaded Knowledge constructor. NFC. The isl C++ bindings now have implicit conversions from isl::set to isl::union_set. Therefore the additional overload accepting isl::set is not required anymore. llvm-svn: 298529	2017-03-22 18:01:23 +00:00
Michael Kruse	29143ec3f7	[DeLICM] Remove AllElements. NFC. It is not used and will not be used (anymore) in future commits. llvm-svn: 298522	2017-03-22 17:18:39 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Roman Gareev	23df27682a	Map the new load to the base pointer of the invariant load hoisted load Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507	2017-03-22 13:57:53 +00:00
Siddharth Bhat	44b6cb4e63	[DependenceInfo] change name Write to MustWrite to remove ambiguity [NFC] "Write" is an overloaded term. In collectInfo() till buildFlow(), it is used to mean "must writes". However, within the memory based analysis, it is used to mean "both may and must writes". Renaming the Write variable helps clarify this difference. Reviewers: grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D31181 llvm-svn: 298361	2017-03-21 11:54:08 +00:00
Tobias Grosser	29eaa16b7e	Update isl to isl-0.18-395-g77701b3 This is a normal maintenance update. llvm-svn: 298352	2017-03-21 09:12:11 +00:00
Tobias Grosser	b28f86e9e6	[CodeGen] Remove need for all parameters to be in scop context for load hoisting. When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions. llvm-svn: 298197	2017-03-18 23:12:49 +00:00
Tobias Grosser	1be726a40d	[IslExprBuilder] Print accessed memory locations with RuntimeDebugBuilder After this change, enabling -polly-codegen-add-debug-printing in combination with -polly-codegen-generate-expressions allows us to instrument the compiled binaries to not only print the values stored and loaded to a given memory access, but also to print the accessed location with array name and per-dimension offset: MemRef_A[3][2] Store to 6299784: 5.000000 MemRef_A[3][3] Load from 6299788: 0.000000 MemRef_A[3][3] Store to 6299788: 6.000000 This can be very helpful for debugging. llvm-svn: 298194	2017-03-18 20:54:43 +00:00
Tobias Grosser	7693b116a1	[OpenMP] Do not emit lifetime markers for context In commit r219005 lifetime markers have been introduced to mark the lifetime of the OpenMP context data structure. However, their use seems incorrect and recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was not at all related to r298053. r298053 only caused a change in the loop order, as this change resulted in a different isl internal representation which caused the scheduler to derive a different schedule. This change then caused the IR to change, which apparently created a pattern in which LLVM exploits the lifetime markers. It seems we are using the OpenMP context outside of the lifetime markers. Even though CrystalMk could probably be fixed by expanding the scope of the lifetime markers, it is not clear what happens in case the OpenMP function call is in a loop which will cause a sequence of starting and ending lifetimes. As it is unlikely that the lifetime markers give any performance benefit, we just drop them to remove complexity. llvm-svn: 298192	2017-03-18 20:10:07 +00:00
Siddharth Bhat	3e4a7d38ab	[ScheduleOptimiser] fix typos in top comment [NFC] coice -> choice Transations -> Transactions llvm-svn: 298095	2017-03-17 14:52:19 +00:00
Michael Kruse	89b1f94e64	Revert "Remove references to AssumptionCache. NFC." The AssumptionCache removal of r289756 has been reverted in r290086/r290087. A different solution has been implemented in r291671 which keeps the AssumptionCache. We can therefore use it again in Polly. This reverts r289791. llvm-svn: 298089	2017-03-17 13:56:53 +00:00
Siddharth Bhat	4fe11cf95f	[DependenceInfo] Remove idempotent union: must-writes with may-writes [NFC] Since may-writes are always a superset of the must-writes, there is no point in taking a union of one with the other. llvm-svn: 298085	2017-03-17 13:26:10 +00:00
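The set identity behind this cleanup is that A ∪ B = B whenever A ⊆ B. A minimal sketch with `std::set`, where integers stand in for written array elements; `ElementSet`, `isSubset`, and `unite` are hypothetical helpers, not Polly or isl functions:

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Each int stands for one written array element (an assumption of this sketch).
using ElementSet = std::set<int>;

// True if every element of Sub is also in Super.
bool isSubset(const ElementSet &Sub, const ElementSet &Super) {
  return std::includes(Super.begin(), Super.end(), Sub.begin(), Sub.end());
}

// Plain set union.
ElementSet unite(const ElementSet &A, const ElementSet &B) {
  ElementSet Result = A;
  Result.insert(B.begin(), B.end());
  return Result;
}
```

With must-writes `{1, 2}` and may-writes `{1, 2, 3}`, `unite(Must, May)` compares equal to `May`, so the union adds nothing.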
Michael Kruse	9b91c62e3a	[ScopInfo/PruneUnprofitable] Move default profitability check. In the previous default, ScopInfo applied the profitability heuristic for scalar accesses (-polly-unprofitable-scalar-accs=true) and the -polly-prune-unprofitable pass was disabled by default (-polly-enable-prune-unprofitable=false), as that pruning was already done. This change switches the defaults to -polly-unprofitable-scalar-accs=false -polly-enable-prune-unprofitable=true such that the scalar access heuristic check is done by the pass. This allows passes between ScopInfo and PruneUnprofitable to optimize away scalar accesses. Without enabling such intermediate passes, there is no change in behaviour of profitability checks in a PassManagerBuilder built pass chain, but it allows us to cover this configuration with the buildbots. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298081	2017-03-17 13:10:05 +00:00
Michael Kruse	f3091bf4cf	[PruneUnprofitable] Add -polly-prune-unprofitable pass. ScopInfo's normal profitability heuristic considers SCoPs where all statements have scalar writes as not profitably optimizable and invalidates the SCoP in that case. However, -polly-delicm and -polly-simplify may be able to remove some of the scalar writes such that the flag -polly-unprofitable-scalar-accs=false allows disabling that part of the heuristic. In cases where DeLICM (or other passes after ScopInfo) are not successful in removing scalar writes, the SCoP is still not profitably optimizable. The schedule optimizer would again try computing another schedule, resulting in slower compilation. The -polly-prune-unprofitable pass applies the profitability heuristic again before the schedule optimizer, so Polly can still bail out even with -polly-unprofitable-scalar-accs=false. Differential Revision: https://reviews.llvm.org/D31033 llvm-svn: 298080	2017-03-17 13:09:52 +00:00
Tobias Grosser	5842dee251	[ScopInfo] Add option to not add parameter bounds to context [NFC] For experiments it is sometimes helpful to provide parameter bound information to polly and to not use these parameter bounds for simplification. Add a new option "-polly-ignore-parameter-bounds" which does precisely this. llvm-svn: 298077	2017-03-17 13:00:53 +00:00
Siddharth Bhat	db5dd14cbb	[DependenceInfo] Replace use of deprecated isl_dim_n_out [NFC] Change isl_dim_n_out to isl_map_dim(*, isl_dim_out) llvm-svn: 298075	2017-03-17 12:59:01 +00:00
Siddharth Bhat	65f3d5201e	[DependenceInfo] Track may-writes and build flow information in Dependences::calculateDependences. This ensures that we handle may-writes correctly when building dependence information. Also add a test case checking correctness of may-write information. Not handling it before was an oversight. Differential Revision: https://reviews.llvm.org/D31075 llvm-svn: 298074	2017-03-17 12:31:28 +00:00
Tobias Grosser	8a6e605e96	[ScopInfo] Do not take inbounds assumptions [NFC] For experiments it is sometimes helpful to not take any inbounds assumptions. Add a new option "-polly-ignore-inbounds" which does precisely this. llvm-svn: 298073	2017-03-17 12:26:58 +00:00
Tobias Grosser	b58ed8d3cd	[ScopInfo] Do not try to eliminate parameter dimensions that do not exist In subsequent changes we will make Polly a little bit more lazy in adding parameter dimensions to different sets. As a result, not all parameters will always be part of the parameter space. This change ensures that we do not use the '-1' returned when a parameter dimension cannot be found, but instead just do not try to eliminate the anyhow non-existing dimension. llvm-svn: 298054	2017-03-17 09:02:53 +00:00
Tobias Grosser	941cb7d979	[ScopInfo] Do not expand getDomains() to full parameter space. For several years, isl has been able to perform most operations on sets with differing parameter spaces by expanding the parameter space on demand, relying on named isl ids to distinguish different parameter dimensions. By not always expanding to full dimensionality, the sets remain smaller and can likely be operated on faster. This change by itself did not yet result in measurable performance benefits, but it is a step in the right direction, needed to ensure that subsequent changes can indeed work with lower-dimensional sets and that these sets do not get blown up by accident when later intersected with the domain context. llvm-svn: 298053	2017-03-17 09:02:50 +00:00
Tobias Grosser	f4fe34bfb8	Update to isl-0.18-387-g3fa6191 This is a normal / regular maintenance update. llvm-svn: 297999	2017-03-16 21:33:20 +00:00
Siddharth Bhat	65c4026992	Set Dependences::RED to be non-null once Dependences::calculateDependences() occurs, even if there is no actual reduction. This ensures correctness with isl operations. llvm-svn: 297981	2017-03-16 20:06:49 +00:00
Michael Kruse	5545407fa4	[ScopInfo] Introduce ScopStmt::getSurroundingLoop(). NFC. Introduce ScopStmt::getSurroundingLoop() to replace getFirstNonBoxedLoopFor. getSurroundingLoop() returns the precomputed surrounding/first non-boxed loop. Except in ScopDetection, the list of boxed loops is only used to get the surrounding loop. getFirstNonBoxedLoopFor also requires LoopInfo at every use which is not necessarily available everywhere where we may want to use it. Differential Revision: https://reviews.llvm.org/D30985 llvm-svn: 297899	2017-03-15 22:16:43 +00:00
Tobias Grosser	d614b3e6bd	Preserve the isl-noexceptions.h C++ bindings when updating isl The bindings currently need to be generated manually, as they are not yet part of the official isl distribution. Hence, we keep them across updates assuming they only need to be updated when new functions or functionality should be exposed. llvm-svn: 297710	2017-03-14 07:46:28 +00:00
Tobias Grosser	9c19a0e16a	Add back header file that was accidentally dropped in previous update llvm-svn: 297709	2017-03-14 07:39:05 +00:00
Tobias Grosser	593ebdfbd1	Update to isl-0.18-369-g5e613c6 This is a regular maintenance update. llvm-svn: 297708	2017-03-14 07:33:26 +00:00
Tobias Grosser	c9d4cb2f42	[ScheduleOptimizer] Allow tiling after fusion In ScheduleOptimizer::isTileableBand(), allow the case in which the band node's child is an isl_schedule_sequence_node and its grandchildren isl_schedule_leaf_nodes. This case can arise when two or more statements are fused by the isl scheduler. The tile_after_fusion.ll test has two statements in separate loop nests and checks whether they are tiled after being fused when polly-opt-fusion equals "max". Reviewers: grosser Subscribers: gareevroman, pollydev Tags: #polly Contributed-by: Theodoros Theodoridis <theodort@student.ethz.ch> Differential Revision: https://reviews.llvm.org/D30815 llvm-svn: 297587	2017-03-12 19:02:31 +00:00
Tobias Grosser	de244eb450	Possible error in doc comment If a SCoP is most probably sequential, then it's better to run it on a CPU. Hence, there's no point in running it on a GPU. Reviewers: grosser Subscribers: nemanjai Tags: #polly Contributed-by: Singapuram Sanjay <singapuram.sanjay@gmail.com> Differential Revision: https://reviews.llvm.org/D30864 llvm-svn: 297578	2017-03-12 08:19:01 +00:00
Tobias Grosser	b2347dc241	[isl++] Add missing /* implicit */ marker llvm-svn: 297577	2017-03-12 08:17:50 +00:00
Tobias Grosser	5ac963743f	[isl++] Add last set of missing isl:: prefixes to increase consistency [NFC] llvm-svn: 297558	2017-03-11 07:58:12 +00:00
Tobias Grosser	d67d368e12	[isl++] Add namespace prefixes to isl::ctx and isl::stat These were missed in r297478. We add them for consistency. llvm-svn: 297520	2017-03-10 22:10:19 +00:00
Tobias Grosser	30a06dce68	[isl++] Drop warning about experimental status As most discussions about these bindings have concluded and only the final patch review on the isl mailing list is missing, we drop the experimental warning tag to match the patchset we will submit to isl, which is expected to not change notably any more. llvm-svn: 297519	2017-03-10 22:10:15 +00:00
Tobias Grosser	9839774e5d	[isl++] Do not use enum prefix Instead of declaring a function as: inline val plain_get_val_if_fixed(enum dim type, unsigned int pos) const; we use: inline isl::val plain_get_val_if_fixed(isl::dim type, unsigned int pos) const; The first argument caused the following compile time error on windows: "error C3431: 'dim': a scoped enumeration cannot be redeclared as an unscoped enumeration" In some cases it is sufficient to just drop the 'enum' prefix, but for example for isl::set the 'enum class dim' type collides with the function name isl::set::dim and can consequently not be referenced. To avoid such kind of ambiguities in the future we add the isl:: prefix consistently to all types used. Reported-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 297478	2017-03-10 17:01:30 +00:00
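The name collision mentioned in the commit above can be reproduced in a few lines. This is a hedged sketch in a hypothetical `toy` namespace, not the generated isl bindings: a scoped enumeration `toy::dim` coexists with a member function `toy::set::dim`, and the fully qualified type name keeps every reference unambiguous.

```cpp
#include <cassert>

namespace toy {

// Scoped enumeration, playing the role of the bindings' dimension type.
enum class dim { param, in, out };

class set {
  unsigned NumParams;
  unsigned NumDims;

public:
  set(unsigned NumParams, unsigned NumDims)
      : NumParams(NumParams), NumDims(NumDims) {}

  // Member function that shares its name with the enum type. Inside this
  // class the unqualified name 'dim' now finds the member function, so the
  // enum is always referred to as 'toy::dim'.
  unsigned dim(toy::dim Type) const;

  bool hasDim(toy::dim Type) const { return dim(Type) > 0; }
};

inline unsigned set::dim(toy::dim Type) const {
  switch (Type) {
  case toy::dim::param:
    return NumParams;
  case toy::dim::out:
    return NumDims;
  case toy::dim::in:
    return 0; // a set has no input dimensions in this sketch
  }
  return 0;
}

} // namespace toy
```

Writing the qualified name everywhere, rather than only where a collision occurs, is the consistency choice the commit message describes.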
Michael Kruse	0446d81e2d	[Simplify] Add -polly-simplify pass. This new pass removes unnecessary accesses and writes. It currently supports 2 simplifications, but more are planned. It removes write accesses that write a loaded value back to the location it was loaded from. It is a typical artifact from DeLICM. Removing it will get rid of bogus dependencies later in dependency analysis. It also removes statements without side-effects. ScopInfo already removes these, but the removal of unnecessary writes can result in more side-effect free statements. Differential Revision: https://reviews.llvm.org/D30820 llvm-svn: 297473	2017-03-10 16:05:24 +00:00
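The first simplification (dropping a write that stores a loaded value back to the location it came from) can be illustrated on a toy access list. This is only a sketch under strong assumptions: accesses are processed in program order within one statement, distinct `Location` values never alias, and equal `ValueId`s denote the same value. `ToyAccess` and `dropWriteBacks` are hypothetical names and do not reflect how the -polly-simplify pass is implemented.

```cpp
#include <cassert>
#include <map>
#include <vector>

// One memory access: a load defines ValueId, a store writes ValueId.
struct ToyAccess {
  bool IsWrite;
  int Location;
  int ValueId;
};

// Removes stores that write a value back to the location it was loaded from,
// provided no other store to that location happened in between.
std::vector<ToyAccess> dropWriteBacks(const std::vector<ToyAccess> &Accesses) {
  std::map<int, int> LoadedValue; // location -> value last loaded from it
  std::vector<ToyAccess> Kept;
  for (const ToyAccess &Acc : Accesses) {
    if (!Acc.IsWrite) {
      LoadedValue[Acc.Location] = Acc.ValueId;
      Kept.push_back(Acc);
      continue;
    }
    auto It = LoadedValue.find(Acc.Location);
    if (It != LoadedValue.end() && It->second == Acc.ValueId)
      continue; // stores what is already there: drop the write
    // A real store changes the content, so the loaded value is stale now.
    LoadedValue.erase(Acc.Location);
    Kept.push_back(Acc);
  }
  return Kept;
}
```

Once such a store is gone, a statement that consisted only of the load and the write-back has no side effects left, which is why the commit pairs this with the removal of side-effect-free statements.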
Tobias Grosser	3e618c33fe	[DeadCodeElimination] Translate to C++ bindings This pass is a small and self-contained example of a piece of code that was written with the isl C interface. The diff of this change nicely shows how the C++ bindings can improve the readability of the code by avoiding the long C function names and by avoiding any need for memory management. As you will see, no calls to isl_*_copy or isl_*_free are needed anymore. Instead the C++ interface takes care of automatically managing the objects. This may introduce internally additional copies, but due to the isl reference counting, such copies are expected to be cheap. For performance critical operations, we will later exploit move semantics to eliminate unnecessary copies that have shown to be costly. Below we give a set of examples that shows the benefit of the C++ interface vs. the pure C interface. Check properties ---------------- Before: if (isl_aff_is_zero(aff) || isl_aff_is_one(aff)) return true; After: if (Aff.is_zero() || Aff.is_one()) return true; Type conversion --------------- Before: isl_union_pw_multi_aff *UPMA = isl_union_pw_multi_aff_from_union_map(umap); After: isl::union_pw_multi_aff UPMA = UMap; Type construction ----------------- Before: auto Empty = isl_union_map_empty(space); After: auto Empty = isl::union_map::empty(Space); Operations ---------- Before: set = isl_union_set_intersect(set, set2); After: Set = Set.intersect(Set2); The use of isl::boolean in return types also increases the robustness of Polly, as on conversion to true or false, we verify that no isl_bool_error has been returned and assert in case an error was returned. Before this change we would have just ignored the error and proceeded with (some) execution path. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30619 llvm-svn: 297466	2017-03-10 15:05:38 +00:00
Tobias Grosser	51ebda8c9d	[FlattenAlgo] Translate to C++ bindings Translate the full algorithm to use the new isl C++ bindings. This is a large piece of code that has been written with the Polly IslPtr<> memory management tool, which only performed memory management, but did not provide a method interface. As such the code was littered with calls to give(), copy(), keep(), and take(). The diff of this change should give a good example how the new method interface simplifies the code by removing the need for switching between managed types and C functions all the time and consequently also the need to use the long C function names. These are a couple of examples comparing the old IslPtr memory management interface with the complete method interface. Check properties ---------------- Before: if (isl_aff_is_zero(Aff.get()) || isl_aff_is_one(Aff.get())) return true; After: if (Aff.is_zero() || Aff.is_one()) return true; Type conversion --------------- Before: isl_union_pw_multi_aff *UPMA = give(isl_union_pw_multi_aff_from_union_map(UMap.copy())); After: isl::union_pw_multi_aff UPMA = UMap; Type construction ----------------- Before: auto Empty = give(isl_union_map_empty(Space.copy())); After: auto Empty = isl::union_map::empty(Space); Operations ---------- Before: Set = give(isl_union_set_intersect(Set.copy(), Set2.copy())); After: Set = Set.intersect(Set2); Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30617 llvm-svn: 297463	2017-03-10 14:55:58 +00:00
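The robustness argument made for isl::boolean in the commit above (an error state must not be silently read as true or false) can be sketched with a small tri-state wrapper. `ToyBoolean` and `toyIsZero` are hypothetical names. The raw encoding of -1 for error, 0 for false, and 1 for true follows the usual isl_bool convention, but the class itself is an illustration and not the binding's implementation.

```cpp
#include <cassert>

// Tri-state result: -1 = error, 0 = false, 1 = true.
class ToyBoolean {
  int Raw;

public:
  explicit ToyBoolean(int Raw) : Raw(Raw) {}

  bool isError() const { return Raw < 0; }

  // Converting to bool checks for the error state instead of letting it
  // masquerade as 'true' (any non-zero int) the way a plain C check would.
  explicit operator bool() const {
    assert(!isError() && "error state converted to bool");
    return Raw > 0;
  }
};

// Hypothetical property check that can fail: a null input yields an error.
ToyBoolean toyIsZero(const int *Value) {
  if (!Value)
    return ToyBoolean(-1);
  return ToyBoolean(*Value == 0);
}
```

With a plain integer return value, `if (toy_is_zero(nullptr))` would take the true branch on error; with the wrapper, the same condition trips the assertion in debug builds.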
Tobias Grosser	4c24e57965	Add method interface to isl C++ bindings The isl C++ binding method interface introduces a thin C++ layer that allows to call isl methods directly on the memory managed C++ objects. This makes the relevant methods directly available via code-completion interfaces, allows for the use of overloading, conversion constructors, and many other nice C++ features that make using isl a lot easier. The individual features will be highlighted in the subsequent commits. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30616 llvm-svn: 297462	2017-03-10 14:53:00 +00:00
Tobias Grosser	deaef15f52	Introduce isl C++ bindings, Part 1: value_ptr style interface Over the last couple of months several authors of independent isl C++ bindings worked together to jointly design an official set of isl C++ bindings which combines their experience in developing isl C++ bindings. The new bindings have been designed around a value pointer style interface and remove the need for explicit pointer management and instead use C++ language features to manage isl objects. This commit introduces the smart-pointer part of the isl C++ bindings and replaces the current IslPtr<T> classes, which served the very same purpose, but had to be manually maintained. Instead, we now rely on automatically generated classes for each isl object, which provide value_ptr semantics. An isl object has the following smart pointer interface: inline set manage(__isl_take isl_set *ptr); class set { friend inline set manage(__isl_take isl_set *ptr); isl_set *ptr = nullptr; inline explicit set(__isl_take isl_set *ptr); public: inline set(); inline set(const set &obj); inline set &operator=(set obj); inline ~set(); inline __isl_give isl_set *copy() const &; inline __isl_give isl_set *copy() && = delete; inline __isl_keep isl_set *get() const; inline __isl_give isl_set *release(); inline bool is_null() const; } The interface and behavior of the new value pointer style classes is inspired by http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf, which proposes a std::value_ptr, a smart pointer that applies value semantics to its pointee. We currently only provide a limited set of public constructors and instead provide a global overloaded type constructor method "isl::obj isl::manage(isl_obj *)", which allows converting an isl_set * to an isl::set by calling 'S = isl::manage(s)'. This pattern models the make_unique() constructor for unique pointers. 
The next two functions isl::obj::get() and isl::obj::release() are taken directly from the std::value_ptr proposal: S.get() extracts the raw pointer of the object managed by S. S.release() extracts the raw pointer of the object managed by S and sets the object in S to null. We additionally add isl::obj::copy(). S.copy() returns a raw pointer referring to a copy of S, which is a shortcut for "isl::obj(oldobj).release()", a functionality commonly needed when interacting directly with the isl C interface where all methods marked with __isl_take require consumable raw pointers. S.is_null() checks if S manages a pointer or if the managed object is currently null. We add this function to provide a more explicit way to check if the pointer is empty compared to a direct conversion to bool. This commit also introduces a couple of polly-specific extensions that cover features currently not handled by the official isl C++ bindings draft, but which have been provided by IslPtr<T> and are consequently added to avoid code churn. These extensions include: - operator bool() : Conversion from objects to bool - construction from nullptr_t - get_ctx() method - take/keep/give methods, which match the currently used naming convention of IslPtr<T> in Polly. They just forward to (release/get/manage). - raw_ostream printers We expect that these extensions are over time either removed or upstreamed to the official isl bindings. We also export a couple of classes that have not yet been exported in isl (e.g., isl::space) As part of the code review, the following two questions were asked: - Why do we not use a standard smart pointer? std::value_ptr was a proposal that has not been accepted. It is consequently not available in the standard library. Even if it were available, we want to expand this interface with a complete method interface that is conveniently available from each managed pointer. 
The most direct way to achieve this is to generate a specialized value style pointer class for each isl object type and add any additional methods to this class. The relevant changes follow in subsequent commits. - Why do we not use templates or macros to avoid code duplication? It is certainly possible to use templates or macros, but as this code is auto-generated there is no need to make writing this code more efficient. Also, most of these classes will be specialized with individual member functions in subsequent commits, such that there will be little code reuse to exploit. Hence, we decided not to do so at the moment. These bindings are not yet officially part of isl, but the draft is already very stable. The smart pointer interface itself has not changed for several months. Adding this code to Polly is against our normal policy of only importing official isl code. In this case however, we make an exception to showcase a non-trivial use case of these bindings which should increase confidence in these bindings and will help upstreaming them to isl. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30325 llvm-svn: 297452	2017-03-10 11:41:03 +00:00
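The value_ptr style interface described in this commit message can be illustrated with a small self-contained sketch. It does not use isl: toy_obj and its toy_obj_alloc/toy_obj_copy/toy_obj_free helpers are hypothetical stand-ins for a reference-counted isl object and its C functions, and the class name obj is likewise made up. Only the member names (manage, copy, get, release, is_null) follow the interface quoted above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical reference-counted C object; NOT part of the isl API.
struct toy_obj {
  int refs;
  int value;
};

inline toy_obj *toy_obj_alloc(int value) { return new toy_obj{1, value}; }
inline toy_obj *toy_obj_copy(toy_obj *p) {
  ++p->refs; // a "copy" only bumps the reference count
  return p;
}
inline void toy_obj_free(toy_obj *p) {
  if (p && --p->refs == 0)
    delete p;
}

class obj;
inline obj manage(toy_obj *ptr);

// Value-pointer style wrapper: copying the wrapper copies the reference,
// destroying the wrapper drops it.
class obj {
  friend obj manage(toy_obj *ptr);
  toy_obj *ptr = nullptr;
  explicit obj(toy_obj *p) : ptr(p) {}

public:
  obj() = default;
  obj(const obj &other) : ptr(other.ptr ? toy_obj_copy(other.ptr) : nullptr) {}
  obj &operator=(obj other) {
    std::swap(ptr, other.ptr); // copy-and-swap; 'other' frees the old pointer
    return *this;
  }
  ~obj() { toy_obj_free(ptr); }

  // Returns a raw pointer the caller owns (for "take" style C functions).
  toy_obj *copy() const & { return ptr ? toy_obj_copy(ptr) : nullptr; }
  toy_obj *copy() && = delete;
  // Returns the raw pointer without transferring ownership.
  toy_obj *get() const { return ptr; }
  // Hands ownership to the caller and leaves the wrapper empty.
  toy_obj *release() {
    toy_obj *tmp = ptr;
    ptr = nullptr;
    return tmp;
  }
  bool is_null() const { return ptr == nullptr; }
};

// Takes ownership of a raw pointer, analogous to 'S = isl::manage(s)'.
inline obj manage(toy_obj *ptr) { return obj(ptr); }
```

In this sketch, `obj A = manage(toy_obj_alloc(7)); obj B = A;` leaves both wrappers sharing one object with a reference count of 2, and the count drops again as each wrapper goes out of scope.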
Tobias Grosser	e5671e54c0	Update to isl-0.18-356-g0b05d01 This is a regular maintenance update. llvm-svn: 297449	2017-03-10 09:17:55 +00:00
Michael Kruse	e4292bf086	[Support] Add -polly-dump-module pass. This pass allows writing the LLVM-IR just before and after the Polly passes to a file. Dumping the IR before Polly helps reproduce bugs that occur in code generated by clang. It is the only reliable way to get the IR that triggers a bug. The alternative is to emit the IR with clang -c -emit-llvm -S -o dump.ll then pass it through all optimization passes opt dump.ll -basicaa -sroa ... -S -o optdump.ll to then reproduce the error with opt optdump.ll -polly-opt-isl -polly-codegen -analyze However, the IR is not the same. -O3 uses a PassBuilder that creates passes with different parameters than the default. Dumping the IR after Polly is useful to compare a miscompilation with a known-good configuration. Differential Revision: https://reviews.llvm.org/D30788 llvm-svn: 297415	2017-03-09 22:29:58 +00:00
Tobias Grosser	8bd7f3c0a5	[ScopDetect/Info] Allow unconditional hoisting of loads from dereferenceable ptrs In case LLVM pointers are annotated with !dereferenceable attributes/metadata or LLVM can look at the allocation from which a pointer is derived, we can know that dereferencing pointers is safe and can be done unconditionally. We use this information to prove certain pointers as safe to hoist and then hoist them unconditionally. llvm-svn: 297375	2017-03-09 11:36:00 +00:00
Michael Kruse	9fb3ab1b19	[DeLICM] Add -polly-delicm-overapproximate-writes option. One of the current limitations of DeLICM is that it only creates PHI WRITEs that it knows are read by some PHI. Such writes may not span all instances of a statement. Polly's code generator currently does not support MemoryAccesses that are not executed in all instances ('partial accesses') and so has to give up on a possible mapping. This workaround has once been suggested by Tobias Grosser: Try to interpolate an arbitrary expansion to all instances. It will be checked for possible conflicts with the existing Knowledge and can be applied if the conflict checking result is that no semantics are changed. Expansion is done by simplifying the mapping by coalescing with the hope that coalescing will find a polyhedral 'rule' of the relevant map. It is then 'gist'-ed using the domain of the relevant instances such that the rule is expanded to the universe and finally intersected with the domain of all statement instances. The expansion makes conflicts become more likely, the found rule may still not encompass all statement instances and the found rule exposes internals of isl's implementation of coalesce and gist. The latter means that the result depends on how much effort the implementation invests into finding a rule which may change between versions of isl. Trivial implementations of gist and coalesce just return the input arguments. A patch that makes codegen support partial accesses is in preparation as well. Differential Revision: https://reviews.llvm.org/D30763 llvm-svn: 297373	2017-03-09 11:23:22 +00:00
Michael Kruse	935b2a3654	[DeadCodeElim] Put -polly-dce-precise-steps into the Polly category. llvm-svn: 297318	2017-03-08 23:25:35 +00:00
Michael Kruse	6744efa8d8	[ScopDetection] Only allow SCoP-wide available base pointers. Simplify ScopDetection::isInvariant(). Essentially deny everything that is defined within the SCoP and is not load-hoisted. The previous understanding of "invariant" has a few holes: - Expressions without side-effects with only invariant arguments, but which are defined within the SCoP's region with the exception of selects and PHIs. These should be part of the index expression derived by ScalarEvolution and not of the base pointer. - Function calls that are !mayHaveSideEffects() (typically functions with "readnone nounwind" attributes). An example is given below. @C = external global i32 declare float* @getNextBasePtr(float) readnone nounwind ... %ptr = call float @getNextBasePtr(float* %A, float %B) The call might return: * %A, so %ptr aliases with it in the SCoP * %B, so %ptr aliases with it in the SCoP * @C, so %ptr aliases with it in the SCoP * a new pointer every time it is called, such as malloc() * a pointer into the allocated block of one of the aforementioned * any of the above, at random at each call Hence, and in contrast to a comment in the base_pointer.ll regression test, %ptr is not necessarily the same all the time. It might also alias with anything and no AliasAnalysis can tell otherwise if the definition is external. It is hence not suitable in the role of a base pointer. The practical problem with base pointers defined in SCoP statements is that they are not available globally in the SCoP. The statement instance must be executed first before the base pointer can be used. This is no problem if the base pointer is transferred as a scalar value between statements. Uses of MemoryAccess::setNewAccessRelation may add a use of the base pointer anywhere in the array. setNewAccessRelation is used by JSONImporter, DeLICM and D28518. 
Indeed, BlockGenerator currently assumes that base pointers are available globally and generates invalid code for new access relations (referring to the base pointer of the original code) if not, even if the base pointer would be available in the statement. This could be fixed with some added complexity and restrictions. The ExprBuilder must look up the local BBMap and code that calls setNewAccessRelation must check whether the base pointer is available first. The code would still be incorrect in the presence of aliasing. There is the switch -polly-ignore-aliasing to explicitly allow this, but it is hardly a justification for the additional complexity. It would still be mostly useless because in most cases either getNextBasePtr() has external linkage in which case the readnone nounwind attributes cannot be derived in the translation unit itself, or is defined in the same translation unit and gets inlined. Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D30695 llvm-svn: 297281	2017-03-08 15:14:46 +00:00
Michael Kruse	5a4ec5c42b	[ScopDetection] Require LoadInst base pointers to be hoisted. Only when load-hoisted can we be sure the base pointer is invariant during the SCoP's execution. Most of the time it would be added to the required hoists for the alias checks anyway, except with -polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if AliasAnalysis is already sure it doesn't alias with anything (for instance if there is no other pointer to alias with). Two more parts in Polly assume that this load-hoisting took place: - setNewAccessRelation() which contains an assert which tests this. - BlockGenerator which would use the base ptr from the original code if not load-hoisted (if the access expression is regenerated) Differential Revision: https://reviews.llvm.org/D30694 llvm-svn: 297195	2017-03-07 20:28:43 +00:00
Tobias Grosser	a0b85963ba	Update isl to isl-0.18-336-g1e193d9 This is a regular maintenance update llvm-svn: 297169	2017-03-07 17:53:34 +00:00
Tobias Grosser	ce69e7b593	[ScopInfo] Avoid infinite loop during schedule construction Our current scop modeling enters an infinite loop when trying to model code that has unreachable instructions (e.g., test/ScopInfo/BoundChecks/single-loop.ll), as the number of basic blocks returned by the LLVM Loop* does not include unreachable basic blocks that branch off from the core loop body. This arises for example in the following piece of code: for (i = 0; i < N; i++) { if (i > 1024) abort(); <- this abort might be translated to an unreachable A[i] = ... } This patch adds these unreachable basic blocks in our per loop basic block count to ensure that the schedule construction does not assume a loop has been processed completely, despite certain unreachable basic blocks still remaining. The infinite loop is only observable in combination with https://reviews.llvm.org/D12676 or a similar patch. llvm-svn: 297156	2017-03-07 16:17:55 +00:00
Tobias Grosser	134a572951	[ScopDetection] Do not detect scops that exit to an unreachable Scops that exit with an unreachable are today still permitted, but make little sense to optimize. We therefore can already skip them during scop detection. This speeds up scop detection in certain cases and also ensures that bugpoint does not introduce unreachables when reducing test cases. In practice this change should have little impact, as the performance of unreachable code is unlikely to matter. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297151	2017-03-07 15:50:43 +00:00
Tobias Grosser	1c787e0b49	[ScopDetection] Do not allow required-invariant loads in non-affine region These loads cannot be safely hoisted as the condition guarding the non-affine region cannot be duplicated to also protect the hoisted load later on. Today they are dropped in ScopInfo. By checking for this early, we do not even try to model them and possibly can still optimize smaller regions not containing this specific required-invariant load. llvm-svn: 296744	2017-03-02 12:15:37 +00:00
Tobias Grosser	c2f151084d	[ScopInfo] Disable memory folding in case it results in multi-disjunct relations Multi-disjunct access maps can easily result in inbound assumptions which explode in case of many memory accesses and many parameters. This change reduces compilation time of some larger kernel from over 15 minutes to less than 16 seconds. Interesting is the test case test/ScopInfo/multidim_param_in_subscript.ll which has a memory access [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] } which requires folding, but where only a single disjunct remains. We can still model this test case even when only using limited memory folding. For people only reading commit messages, here is the comment that explains what memory folding is: To recover memory accesses with array size parameters in the subscript expression we post-process the delinearization results. We would normally recover from an access A[exp0(i) * N + exp1(i)] into an array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the range of exp1(i) - may be preferable. Specifically, for cases where we know exp1(i) is negative, we want to choose the latter expression. As we commonly do not have any information about the range of exp1(i), we do not choose one of the two options, but instead create a piecewise access function that adds the (-1, N) offsets as soon as exp1(i) becomes negative. For a 2D array such an access function is created by applying the piecewise map: [i,j] -> [i, j] : j >= 0 [i,j] -> [i-1, j+N] : j < 0 After this patch we generate only the first case, except for situations where we can prove the first case to be invalid and can consequently select the second without introducing disjuncts. llvm-svn: 296679	2017-03-01 21:11:27 +00:00
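The piecewise map quoted in this commit message can be checked with a few lines of plain C++. This is only a sketch of the arithmetic, not Polly's implementation (which works on isl maps); the names Subscript, foldSubscript and linearize are hypothetical. It applies a single folding step, so it assumes the column subscript is never smaller than -N.

```cpp
#include <cassert>
#include <utility>

// (row, column) subscript into a 2D array A[][N].
using Subscript = std::pair<long, long>;

// Piecewise map from the commit message:
//   [i, j] -> [i, j]         : j >= 0
//   [i, j] -> [i - 1, j + N] : j < 0
inline Subscript foldSubscript(long I, long J, long N) {
  if (J >= 0)
    return {I, J};
  return {I - 1, J + N};
}

// Flat offset of a 2D subscript in row-major order.
inline long linearize(Subscript S, long N) { return S.first * N + S.second; }
```

Both pieces address the same element: for N = 10, the subscript (3, -2) folds to (2, 8), and both linearize to offset 28.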
Tobias Grosser	24222c7357	Fix namespaces after clang-format update llvm-svn: 296635	2017-03-01 15:54:27 +00:00
Tobias Grosser	d7c4975349	[ScopInfo] Simplify inbounds assumptions under domain constraints Without this simplification for a loop nest: void foo(long n1_a, long n1_b, long n1_c, long n1_d, long p1_b, long p1_c, long p1_d, float A_1[][p1_b][p1_c][p1_d]) { for (long i = 0; i < n1_a; i++) for (long j = 0; j < n1_b; j++) for (long k = 0; k < n1_c; k++) for (long l = 0; l < n1_d; l++) A_1[i][j][k][l] += i + j + k + l; } the assumption: n1_a <= 0 or (n1_a > 0 and n1_b <= 0) or (n1_a > 0 and n1_b > 0 and n1_c <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d > 0 and p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d) is taken rather than the simpler assumption: p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d. The former is less strict, as it allows arbitrary values of p1_* in case the loop is not executed at all. However, in practice these precise constraints explode when combined across different accesses and loops. For now it seems to make more sense to take less precise, but more scalable constraints by default. In case we find a practical example where more precise constraints are needed, we can think about allowing such precise constraints in specific situations where they help. This change speeds up the new test case from taking very long (waited at least a minute, but it probably takes a lot more) to below a second. llvm-svn: 296456	2017-02-28 09:45:54 +00:00
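The relation between the two assumptions in this commit message can be written out as ordinary predicates. The sketch below is an illustration only: the struct Params and the function names are hypothetical, and Polly represents these constraints as isl sets rather than C++ code. The long disjunction collapses to "the loop nest does not execute, or the array sizes cover the loop bounds".

```cpp
#include <cassert>

// Loop bounds (n_*) and array dimension sizes (p_*) of the example nest.
struct Params {
  long n_a, n_b, n_c, n_d;
  long p_b, p_c, p_d;
};

// The simpler assumption: every inner array dimension covers its loop bound.
inline bool simplifiedAssumption(const Params &P) {
  return P.p_b >= P.n_b && P.p_c >= P.n_c && P.p_d >= P.n_d;
}

// The precise assumption: the first four disjuncts together say that at
// least one loop has no iterations; only the last one constrains p_*.
inline bool preciseAssumption(const Params &P) {
  bool NestExecutes = P.n_a > 0 && P.n_b > 0 && P.n_c > 0 && P.n_d > 0;
  return !NestExecutes || simplifiedAssumption(P);
}
```

Whenever simplifiedAssumption holds, preciseAssumption holds as well; the two differ only for parameter values under which the loop nest never executes.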
Tobias Grosser	cf66ea3845	Update isl to isl-0.18-304-g1efe43d This is a normal maintenance update. llvm-svn: 296441	2017-02-28 07:06:06 +00:00
Michael Kruse	6469380daa	[Cmake] Optionally use a system isl version. This patch adds an option to build against a version of libisl already installed on the system. The installation is autodetected using the pkg-config file shipped with isl. The detection of the library is in the FindISL.cmake module that creates an imported target. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D30043 llvm-svn: 296361	2017-02-27 17:54:25 +00:00
Michael Kruse	b295c37a15	[DeLICM] Statistics for use in regression tests. Print some measurements of the DeLICM transformation at -analyze to be used in regression tests. llvm-svn: 296347	2017-02-27 15:53:13 +00:00
Roman Gareev	bc3fbe49c5	Disable the parallel code generation in case of extension nodes We can not perform the dependence analysis and, consequently, the parallel code generation in case the schedule tree contains extension nodes. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30394 llvm-svn: 296325	2017-02-27 08:03:11 +00:00
Michael Kruse	e199f285b0	[DeLICM] Fortify against exceeding isl's max operations counter. Control flow would flow through after the check whether the operations quota was exceeded, with the intention that it would later be caught by Knowledge::isUsable(). However, the Knowledge constructor has its own assertions to check consistency which would fail if its fields have only been initialized partially because some sets have been computed correctly before the operations quota takes effect. Fix by erroring out early instead of falling through into the code that might expect that everything has been computed correctly. For robustness, also bail out if any of the fields contain nullptr values instead of relying on isl always setting exactly this error code if something went wrong. This should fix the perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable (-polly-process-unprofitable -polly-position=before-vectorizer -polly-enable-delicm) buildbot. llvm-svn: 296022	2017-02-23 21:58:20 +00:00
Michael Kruse	f4e201e09f	[Support] Remove NonowningIslPtr. NFC. NonowningIslPtr<isl_X> was used as types of function parameters when the function does not consume the isl object, i.e. an __isl_keep parameter. The alternatives are: 1. IslPtr<isl_X> This has additional calls to isl_X_copy and isl_X_free to increase/decrease the reference counter even though not needed. The caller already owns a reference to the isl object. 2. const IslPtr<isl_X>& This does not change the reference counter, but requires an additional load to get the pointer to the isl object (instead of just passing the pointer itself). Moreover, the compiler cannot rely on the constness of the pointer and has to reload the pointer every time it writes to memory (unless alias analysis such as TBAA says it is not possible). The isl C++ bindings currently in development do not have an equivalent to NonowningIslPtr and adding one would make the binding more complicated and its advantage in performance is small. In order to simplify the transition to these C++ bindings, remove NonowningIslPtr. Change every former use of it to alternative 2 mentioned above (const IslPtr<isl_X>&). llvm-svn: 295998	2017-02-23 17:57:27 +00:00
Michael Kruse	2c7169d00c	[DependenceInfo] Remove unused variable. NFC. llvm-svn: 295987	2017-02-23 15:41:01 +00:00
Michael Kruse	dd6f29375b	[DependenceInfo] Use references instead of double pointers. NFC. Non-const references are the more C++-ish way to modify a variable passed by the caller. llvm-svn: 295986	2017-02-23 15:40:56 +00:00
Michael Kruse	ec8fc32160	[DependenceInfo] Rename StmtScheduleDomain -> TaggedStmtDomain. NFC. llvm-svn: 295985	2017-02-23 15:40:52 +00:00
Michael Kruse	00c38e0df2	[DependenceInfo] Simplify use of StmtSchedule's domain [NFC] Once a StmtSchedule is created, only its domain is used anywhere within DependenceInfo::calculateDependences. So, we choose to return the wrapped domain of the union_map rather than the entire union_map. However, we still build the union_map first within collectInfo(). It is cleaner to first build the entire union_map and then pull the domain out in one shot, rather than repeatedly extracting the domain in bits and pieces from accdom. Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30208 llvm-svn: 295984	2017-02-23 15:40:46 +00:00
Michael Kruse	52ab4943b4	Remove all references to PostDominators. NFC. Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the Polly pass chain to be interrupted. It is not used by any Polly pass anymore, hence we can remove all references to it. llvm-svn: 295983	2017-02-23 15:16:22 +00:00
Michael Kruse	9f519714b3	[DeLICM] Add missing Doxygen comment. NFC. llvm-svn: 295978	2017-02-23 14:51:50 +00:00
Michael Kruse	311ecb00dc	[DeLICM] Capitalize parameter name. NFC. llvm-svn: 295977	2017-02-23 14:51:45 +00:00
Tobias Grosser	59d23bbdc6	Update isl to isl-0.18-282-g12465a5 Besides a variety of smaller cleanups, this update also contains a correctness fix to isl coalesce which resolves a crash in Polly. llvm-svn: 295966	2017-02-23 12:48:42 +00:00
Roman Gareev	96e1119a96	Make optimizations based on pattern matching be enabled by default Currently, pattern based optimizations of Polly can identify matrix multiplication and optimize it according to BLIS matmul optimization pattern (see ScheduleTreeOptimizer for details). This patch makes optimizations based on pattern matching be enabled by default. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30293 llvm-svn: 295958	2017-02-23 11:44:12 +00:00
Michael Kruse	d8d32bb3d1	[DeLICM] Regression test for skipping map targets. Add optimization-remarks-missed for when mapping targets have been skipped and add regression tests for them. llvm-svn: 295953	2017-02-23 10:25:20 +00:00
Michael Kruse	deb30e8278	[DeLICM] Add regression tests for DeLICM reject cases. These tests were not included in the main DeLICM commit. These check the cases where zone analysis cannot be successful because of assumption violations. We use the LLVM optimization remark infrastructure as it seems to be the best fit for this kind of message. I tried to make use of the OptimizationRemarkEmitter. However, it would insert additional function passes into the pass manager to get the hotness information. The pass manager would insert them between the flatten pass and delicm, causing the ScopInfo with the flattened schedule being thrown away. Differential Revision: https://reviews.llvm.org/D30253 llvm-svn: 295846	2017-02-22 15:14:08 +00:00
Michael Kruse	8474470500	[DeLICM] Fix wrong comment. NFC. Correct a comment that claimed that a store after load was detected when the code checks a load after a store. llvm-svn: 295835	2017-02-22 14:14:40 +00:00
Michael Kruse	43ed25f1d9	[DeLICM] Print message when zone analysis is not available on -analyze. This is to distinguish the cases that analysis has failed from the case where no transformation was performed. llvm-svn: 295833	2017-02-22 13:48:35 +00:00
Michael Kruse	91cdafb86f	[DeLICM] Use opt<int>. There is no template specialization for cl::parser<unsigned long> such that parsing a cl::opt<unsigned long> command line argument will fail. Use opt<int> instead which has an associated parser. llvm-svn: 295832	2017-02-22 13:48:18 +00:00
Tobias Grosser	cc43087afc	[DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC] We only ever use the wrapped domain of AccessSchedule, so stop creating an entire union_map and then pulling the domain out. Reviewers: grosser Tags: #polly Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30179 llvm-svn: 295726	2017-02-21 15:38:31 +00:00
Michael Kruse	9e52c39f0a	[DeLICM] Map values hoisted by LICM back to the array. Implement the -polly-delicm pass. The pass intends to undo the effects of LoopInvariantCodeMotion (LICM) which adds additional scalar dependencies into SCoPs. DeLICM will try to map those scalars back to the array elements they were promoted from, as long as the array element is unused. This is the main patch from the DeLICM/DePRE patch series. It does not yet undo GVN PRE for which additional information about known values is needed and does not handle PHI write accesses that have no target. As such its usefulness is limited. Patches for these issues including regression tests for error situations will follow. Reviewers: grosser Differential Revision: https://reviews.llvm.org/D24716 llvm-svn: 295713	2017-02-21 10:20:54 +00:00
Michael Kruse	5ab24fdb73	[Cmake] Install the isl headers into the install tree. isl headers are currently missing in a Polly installation. Because the Polly headers depend on those, code can't be compiled against an installed Polly. This patch installs the isl headers. I left a TODO, as optionally it should be possible to use a system version of isl instead of the one shipped with Polly. When compiling, clients of the installation need to add -I${PREFIX}/include/polly/ to their include path right now, because there currently is no way to export this path automatically. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D29931 llvm-svn: 295671	2017-02-20 16:57:14 +00:00
Tobias Grosser	079d511891	[ScopInfo] Count read-only arrays when computing complexity of alias check Instead of counting the number of read-only accesses, we now count the number of distinct read-only array references when checking if a run-time alias check may be too complex. The run-time alias check is quadratic in the number of base pointers, not the number of accesses. Before this change we accidentally skipped SPEC's lbm test case. llvm-svn: 295567	2017-02-18 20:51:29 +00:00
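The difference between the two counts in this commit message can be shown with a small sketch. It is an illustration under assumptions, not Polly's code: the Access struct and both function names are hypothetical, and an array is treated as read-only when no access in the list writes to it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One memory access: which array it touches and whether it is a read.
struct Access {
  std::string BaseArray;
  bool IsRead;
};

// Collects the names of all arrays that are written at least once.
inline std::set<std::string> writtenArrays(const std::vector<Access> &Accs) {
  std::set<std::string> Written;
  for (const Access &A : Accs)
    if (!A.IsRead)
      Written.insert(A.BaseArray);
  return Written;
}

// Old metric: every read of a never-written array counts separately.
inline std::size_t countReadOnlyAccesses(const std::vector<Access> &Accs) {
  std::set<std::string> Written = writtenArrays(Accs);
  std::size_t Count = 0;
  for (const Access &A : Accs)
    if (A.IsRead && !Written.count(A.BaseArray))
      ++Count;
  return Count;
}

// New metric: each never-written array counts once, however often it is read.
inline std::size_t countDistinctReadOnlyArrays(const std::vector<Access> &Accs) {
  std::set<std::string> Written = writtenArrays(Accs);
  std::set<std::string> ReadOnly;
  for (const Access &A : Accs)
    if (A.IsRead && !Written.count(A.BaseArray))
      ReadOnly.insert(A.BaseArray);
  return ReadOnly.size();
}
```

A kernel that reads two arrays many times and writes a third has many read-only accesses but only two read-only base pointers, which is the quantity the run-time alias check cost depends on according to the message.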
Tobias Grosser	28492b85e2	[DependenceInfo] Pull out statement [NFC] This simplifies the code slightly. llvm-svn: 295551	2017-02-18 16:41:28 +00:00
Tobias Grosser	8ee46985d2	[Dependences] Compute reduction dependences on schedule tree [NFC] This change gets rid of the need for zero padding, makes the reduction computation code more similar to the normal dependence computation, and also better documents what we do at the moment. Making the dependence computation for reductions a little bit easier to understand will hopefully help us to further reduce code duplication. This reduces the time spent only in the reduction dependence pass from 260ms to 150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over 40% in dependence computation time. This change was inspired by discussions with Michael Kruse, Utpal Bora, Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further cleanups of the reduction code. llvm-svn: 295550	2017-02-18 16:39:04 +00:00

2832 Commits