llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	0e370cf1a7	Check whether IslAstInfo and DependenceInfo were computed for the same Scop. Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo, we might get analyses that were computed by a different ScopInfo for a different Scop structure. This would be unfortunate because DependenceInfo and IslAstInfo hold references to resources allocated by ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable things can happen. When the ScopInfo/Scop object is freed, there is a high probability that the new ScopInfo/Scop object will be created at the same heap position with the same address. Comparing the Scop or ScopInfo address against the expected one is therefore unreliable. Instead, we compare the address of the isl_ctx object. Both DependenceInfo and IslAstInfo must hold a reference to the isl_ctx object to ensure it is not freed before the destruction of those analyses, which might happen after the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx will not be freed and its address not reused as long as there is a DependenceInfo or IslAstInfo around. This fixes llvm.org/PR34441 llvm-svn: 313842	2017-09-21 00:01:13 +00:00
Michael Kruse	8dceb76066	[ScheduleOptimizer] Fix and test schedule tree statistics. Fix walking over the schedule tree to collect its properties (Number of permutable bands etc.). Also add regression tests for these statistics. llvm-svn: 313750	2017-09-20 11:53:05 +00:00
Michael Kruse	89972e21f8	[ForwardOpTree] Allow out-of-quota in examination part of forwardTree. Computing the reaching definition in forwardTree() can take a long time if the coefficients are large. When the forwarding is carried out (doIt==true), forwardTree() must execute entirely or not at all to get a consistent output, which means we cannot just allow out-of-quota errors to happen in the middle of the processing. We introduce the class IslQuotaScope, which allows opting in code that is conformant and has been tested with out-of-quota events. In case of ForwardOpTree, out-of-quota is allowed during the operand tree examination, but not during the transformation. The same forwardTree() recursion is used for examination and execution, meaning that the reaching definition has already been computed in the examination tree walk and cached for reuse in the transformation tree walk. This should fix the time-out of grtestutils.ll of the aosp buildbot. If the compilation still takes too long, we can reduce the max-operations allowed for -polly-optree. Differential Revision: https://reviews.llvm.org/D37984 llvm-svn: 313690	2017-09-19 22:53:20 +00:00
Philipp Schaad	cf0a22f786	[GPUJIT] Improved temporary file handling. Summary: Improved the way the GPUJIT handles temporary files for Intel's Beignet. Reviewers: bollu, grosser Reviewed By: grosser Subscribers: philip.pfaffe, pollydev Differential Revision: https://reviews.llvm.org/D37691 llvm-svn: 313623	2017-09-19 10:41:29 +00:00
Michael Kruse	ef8325ba50	[ForwardOpTree] Test the max operations quota. cl::opt<unsigned long> is not specialized and hence the option -polly-optree-max-ops is impossible to use. Replace by supported option cl::opt<unsigned>. Also check for an error state when computing the written value, which happens when the quota runs out. llvm-svn: 313546	2017-09-18 17:43:50 +00:00
Michael Kruse	eac3eebfea	[test] Enable -polly-codegen-verify for regression tests. In r301670 IR verification was disabled. Since then, CodeGen writing malformed IR would only be noticed by unpredictable behavior in follow-up passes (e.g. segfaults, infinite loops) or IR verification in the backend assert builds. Re-enable -polly-codegen-verify for the regression tests to ensure that malformed IR is detected where Polly generated malformed IR in the past and changes in CodeGen are at least partially covered by check-polly (otherwise malformed IR may only get noticed when the buildbots run the test-suite). Differential Revision: https://reviews.llvm.org/D37969 llvm-svn: 313527	2017-09-18 12:34:11 +00:00
Michael Kruse	ad32de9424	[ForwardOptTree] Remove redundant simplify(). NFC. The result of computeKnown has already been simplified. llvm-svn: 313526	2017-09-18 12:28:07 +00:00
Zachary Turner	ce92db13ea	Resubmit "[lit] Force site configs to run before source-tree configs" This is a resubmission of r313270. It broke standalone builds of compiler-rt because we were not correctly generating the llvm-lit script in the standalone build directory. The fixes incorporated here attempt to find llvm/utils/llvm-lit from the source tree returned by llvm-config. If present, it will generate llvm-lit into the output directory. Regardless, the user can specify -DLLVM_EXTERNAL_LIT to point to a specific lit.py on their file system. This supports the use case of someone installing lit via a package manager. If it cannot find a source tree, and -DLLVM_EXTERNAL_LIT is either unspecified or invalid, then we print a warning that tests will not be able to run. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313407	2017-09-15 22:10:46 +00:00
Zachary Turner	83dcb68468	Revert "[lit] Force site configs to run before source-tree configs" This patch is still breaking several multi-stage compiler-rt bots. I already know what the fix is, but I want to get the bots green for now and then try re-applying in the morning. llvm-svn: 313335	2017-09-15 02:56:40 +00:00
Zachary Turner	a0e55b6403	[lit] Force site configs to be run before source-tree configs This patch simplifies LLVM's lit infrastructure by enforcing an ordering that a site config is always run before a source-tree config. A significant amount of the complexity from lit config files arises from the fact that inside of a source-tree config file, we don't yet know if the site config has been run. However it is always required to run a site config first, because it passes various variables down through CMake that the main config depends on. As a result, every config file has to do a bunch of magic to try to reverse-engineer the location of the site config file if they detect (heuristically) that the site config file has not yet been run. This patch solves the problem by emitting a mapping from source tree config file to binary tree site config file in llvm-lit.py. Then, during discovery when we find a config file, we check to see if we have a target mapping for it, and if so we use that instead. This mechanism is generic enough that it does not affect external users of lit. They will just not have a config mapping defined, and everything will work as normal. On the other hand, for us it allows us to make many simplifications: * We are guaranteed that a site config will be executed first * Inside of a main config, we no longer have to assume that attributes might not be present and use getattr everywhere. * We no longer have to pass parameters such as --param llvm_site_config=<path> on the command line. * It is future-proof, meaning you don't have to edit llvm-lit.in to add support for new projects. * All of the duplicated logic of trying various fallback mechanisms of finding a site config from the main config are now gone. One potentially noteworthy thing that was required to implement this change is that whereas the ninja check targets previously used the first method to spawn lit, they now use the second. 
In particular, you can no longer run lit.py against the source tree while specifying the various `foo_site_config=<path>` parameters. Instead, you need to run llvm-lit.py. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313270	2017-09-14 16:47:58 +00:00
Roman Gareev	925ce50f1b	Unroll and separate the remaining parts of isolation The remaining parts produced by the full partial tile isolation can contain hot spots that are worth optimizing. Currently, we rely on the simple loop unrolling pass, LICM and the SLP vectorizer to optimize such parts. However, the approach can suffer from the lack of the information about aliasing that Polly provides using additional alias metadata and/or the lack of the information required by the simple loop unrolling pass. This patch is the first step to optimize the remaining parts. To do it, we unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps to increase the performance of the generated code from 39.87 GFlop/s to 49.23 GFlop/s. The next possible step is to avoid unrolling performed by Polly in case of isolated and remaining parts and rely only on the simple loop unrolling pass and the Loop vectorizer. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37692 llvm-svn: 312929	2017-09-11 17:46:47 +00:00
Michael Kruse	0481d78c6c	[CodegenCleanup] Update cleanup passes according to (old) PassManagerBuilder. Update CodegenCleanup using the function-level passes added by populatePassManager that run between EP_EarlyAsPossible and EP_VectorizerStart in -O3. The changes in particular are: - Added pass create arguments, e.g. ExpensiveCombines for InstCombine. - Remove reroll pass. The option -reroll-loops is disabled by default. - Add passes run with UnitAtATime, which is the default. - Add instances of LibCallsShrinkWrap, TailCallElimination, SCCP (sparse conditional constant propagation), Float2Int that did not run before. - Add instances of GVN as in the default pipeline. Notes: - GVNHoist, GVNSink, NewGVN are still disabled in the -O3 pipeline. - The optimization level and other optimization parameters are not accessible outside of PassManagerBuilder, hence we cannot add passes depending on these. Differential Revision: https://reviews.llvm.org/D37571 llvm-svn: 312875	2017-09-09 21:43:49 +00:00
Reid Kleckner	b79e7a6897	Fix some unused warnings in polly llvm-svn: 312755	2017-09-07 22:46:51 +00:00
Michael Kruse	2f5cbc449a	[CodeGen] Bitcast scalar writes to actual value. The type of NewValue might change due to ScalarEvolution looking through bitcasts. The synthesized NewValue therefore becomes the type before the bitcast. llvm-svn: 312718	2017-09-07 12:15:01 +00:00
Siddharth Bhat	e2950f46c6	[PPCGCodeGen] Document pre-composition with Zero in getExtent. [NFC] It's weird at first glance that we do this, so I wrote up some documentation on why we need to perform this process. llvm-svn: 312715	2017-09-07 11:57:33 +00:00
Michael Kruse	8ee179d3b4	Revert "[ScopDetect/Info] Look through PHIs that follow an error block" This reverts commit r312410 - [ScopDetect/Info] Look through PHIs that follow an error block The commit caused generation of invalid IR due to accessing a parameter that does not dominate the SCoP. llvm-svn: 312663	2017-09-06 19:05:40 +00:00
Michael Kruse	48c726f925	[test] Add forgotten REQUIRES: line. llvm-svn: 312632	2017-09-06 13:11:24 +00:00
Michael Kruse	bd84ce8931	[ZoneAlgo] Handle non-StoreInst/LoadInst MemoryAccesses including memset. Up to now ZoneAlgo considered array elements accessed by something other than a LoadInst or StoreInst as not analyzable. This patch removes that restriction by using the unknown ValInst to describe the written content, respectively the element type's null value in case of memset. Differential Revision: https://reviews.llvm.org/D37362 llvm-svn: 312630	2017-09-06 12:40:55 +00:00
Michael Kruse	420c4863a9	[Simplify] Actually remove unused instruction from region header. Since r312249 instructions of an entry block of region statements are not marked as root anymore and hence can theoretically be removed if unused. Theoretically, because the instruction list was not changed. Still, MemoryAccesses for unused instructions were removed. This led to a failed assertion in the code generator when the MemoryAccess for the still listed instruction was not found. This should fix the Assertion failed: ArrayAccess && "No array access found for instruction!", file ScopInfo.h, line 1494 compiler crashes. llvm-svn: 312566	2017-09-05 19:44:39 +00:00
Tobias Grosser	1a695b1d6c	[CodegenCleanup] Use old GVN pass instead of NewGVN It seems NewGVN still has some problems: llvm.org/PR34452, we will switch back after they have been resolved. llvm-svn: 312480	2017-09-04 11:04:33 +00:00
Tobias Grosser	8703e38380	[ISLTools]: Move singleton to isl++ llvm-svn: 312476	2017-09-04 10:05:29 +00:00
Tobias Grosser	3575afd739	[DeLICM] Move some functions to isl++ [NFC] llvm-svn: 312475	2017-09-04 10:05:25 +00:00
Tobias Grosser	d6e0679c4e	[ForwardOp] Remove read accesses for all instructions that have been moved Before this patch, OpTree did not consider forwarding an operand tree consisting of only a single LoadInst as useful. The motivation was that, like an access to a read-only variable, it would just replace one MemoryAccess by another. However, in contrast to read-only accesses, this would replace a scalar access by an array access, which is something worth doing. In addition, leaving a scalar MemoryAccess is problematic in that VirtualUse prioritizes inter-Stmt use over intra-Stmt. It was possible that the same LLVM value has a MemoryAccess for accessing the remote Stmt's LoadInst as well as having the same LoadInst in its own instruction list (due to being forwarded from another operand tree). With this patch we ensure that if a LoadInst is forwarded in any operand tree, the operand tree containing just the LoadInst is forwarded as well, which effectively removes the scalar MemoryAccess such that only the array access remains, not both. Thanks Michael for the detailed explanation. Reviewers: Meinersbur, bellu, singam-sanjay, gareevroman Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37424 llvm-svn: 312456	2017-09-03 19:52:15 +00:00
Tobias Grosser	701d943d12	[IslAst] Do not assert in case of empty min/max alias locations In certain situations, the context in the isl_ast_build could result for the min/max locations of our alias sets to become empty, which would cause an internal error in isl, which is then unable to derive a value for these expressions. Check these conditions before code generating expressions and instead assume that alias check succeeded. This is valid, as the corresponding memory accesses will not be executed under any valid context. This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting. llvm-svn: 312455	2017-09-03 19:47:19 +00:00
Tobias Grosser	6b1e461329	[IslAst] Move buildCondition to isl++ llvm-svn: 312452	2017-09-03 18:31:44 +00:00
Tobias Grosser	99ccf05694	[ScopHelper] Do not crash on unreachable blocks This resolves llvm.org/PR34433. Thanks to Zhendong Su for reporting. llvm-svn: 312451	2017-09-03 18:01:22 +00:00
Michael Kruse	7954a221f3	[ForwardOpTree] Fix typos. NFC. llvm-svn: 312446	2017-09-03 16:09:38 +00:00
Tobias Grosser	4baedc70d1	[ScopDetect/Info] Look through PHIs that follow an error block In case a PHI node follows an error block we can assume that the incoming value can only come from the node that is not an error block. As a result, conditions that seemed non-affine before are now in fact affine. llvm-svn: 312410	2017-09-02 08:25:55 +00:00
Siddharth Bhat	3928e3f50a	[ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses. In Polly, we specifically add a parameter to represent the outermost dimension size of Fortran arrays. We do this because this information is statically available from the Fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from memory accesses. This is wrong, we should materialize parameters from scop array info. It is wrong because if there is a case where we detect 2 Fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather than just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350	2017-09-01 18:55:43 +00:00
Michael Kruse	0c6c555beb	Fix Memory Access of failing tests. Mark scalar dependences for different statements belonging to same BB as 'Inter'. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D37147 llvm-svn: 312324	2017-09-01 11:36:52 +00:00
Roman Gareev	1cb3491620	Run GVN during the cleanup Currently, GVN can be necessary to eliminate redundant instructions in case of, for instance, GEMM and float type. This patch makes GVN be run during the cleanup. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D37340 llvm-svn: 312307	2017-09-01 06:52:28 +00:00
Tobias Grosser	04567fd480	Drop unused statistic counter llvm-svn: 312304	2017-09-01 02:17:10 +00:00
Mandeep Singh Grang	c2774a549b	[polly] Fix non-deterministic output due to iteration of unordered ScopArrayInfo Summary: This fixes the following failures in the reverse iteration builder: http://lab.llvm.org:8011/builders/reverse-iteration/builds/25 Polly :: MaximalStaticExpansion/working_deps_between_inners.ll Polly :: MaximalStaticExpansion/working_expansion_multiple_dependences_per_statement.ll Polly :: MaximalStaticExpansion/working_expansion_multiple_instruction_per_statement.ll Polly :: MaximalStaticExpansion/working_phi_expansion.ll Reviewers: simbuerg, Eugene.Zelenko, grosser, zinob, bollu Reviewed By: grosser Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37349 llvm-svn: 312273	2017-08-31 20:10:30 +00:00
Roman Gareev	6589748920	Use the information about the target cache provided by the TargetTransformInfo. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37178 llvm-svn: 312255	2017-08-31 17:07:54 +00:00
Tobias Grosser	2307f86c47	[ForwardOpTree] Allow forwarding in the presence of region statements Summary: After region statements now also have instruction lists, this is a straightforward extension. Reviewers: Meinersbur, bollu, singam-sanjay, gareevroman Reviewed By: Meinersbur Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37298 llvm-svn: 312249	2017-08-31 16:04:49 +00:00
Siddharth Bhat	56572c6a5e	[PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible. This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes function/intrinsic very uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239	2017-08-31 13:03:37 +00:00
Tobias Grosser	c43d0360cc	[BlockGenerator] Generate entry block of regions from instruction lists This adds code generation support for the previous commit. This patch has been re-applied, after the memory issue in the previous patch has been fixed. llvm-svn: 312211	2017-08-31 03:17:35 +00:00
Tobias Grosser	bd15d13d4e	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. This change set is reapplied, after a memory corruption issue had been fixed. llvm-svn: 312210	2017-08-31 03:15:56 +00:00
Tobias Grosser	d3edc16416	Revert "[ScopInfo] Use statement lists for entry blocks of region statements" This reverts commit r312128. It caused some memory issues. llvm-svn: 312209	2017-08-31 02:43:49 +00:00
Tobias Grosser	6f1f5cbb5b	Revert "[BlockGenerator] Generate entry block of regions from instruction lists" This reverts commit r312129. It caused some memory issues. llvm-svn: 312208	2017-08-31 02:43:27 +00:00
Adrian Prantl	6120801066	Adapt testcase to LLVM change in DIGlobalVariableExpression. llvm-svn: 312147	2017-08-30 18:12:35 +00:00
Tobias Grosser	1e34508bcc	[BlockGenerator] Generate entry block of regions from instruction lists This adds code generation support for the previous commit. llvm-svn: 312129	2017-08-30 15:08:30 +00:00
Tobias Grosser	6fbe4c8501	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. llvm-svn: 312128	2017-08-30 15:08:21 +00:00
Michael Kruse	f3387836d0	[ScopBuilder/ScopInfo] Move reduction detection to ScopBuilder. NFC. Reduction detection is only executed in the SCoP building phase. Hence it fits better into ScopBuilder to separate SCoP-construction from SCoP modeling. llvm-svn: 312118	2017-08-30 13:05:08 +00:00
Michael Kruse	35aa9d862e	[ScopBuilder/ScopInfo] Move ScopStmt::collectSurroundingLoops to ScopBuilder. NFC. This method is only called in the SCoP building phase. Therefore it fits better into ScopBuilder to separate SCoP-construction from SCoP modeling. llvm-svn: 312117	2017-08-30 13:05:01 +00:00
Michael Kruse	eb83141f9e	[ScopBuilder/ScopInfo] Move ScopStmt::buildDomain to ScopBuilder. NFC. This method is only called in the SCoP building phase. Therefore it fits better into ScopBuilder to separate SCoP-construction from SCoP modeling. llvm-svn: 312116	2017-08-30 13:04:54 +00:00
Michael Kruse	a29f8c03d4	[ScopBuilder/ScopInfo] Move ScopStmt::buildAccessRelations to ScopBuilder. NFC. This method is only called in the SCoP building phase. Therefore it fits better into ScopBuilder to separate SCoP-construction from SCoP modeling. This mostly mechanical change makes ScopBuilder directly access some of ScopStmt/MemoryAccess private fields. We add ScopBuilder as a friend class and will add proper accessor functions sometime later. llvm-svn: 312115	2017-08-30 13:04:46 +00:00
Michael Kruse	f6eb3a2ed2	[ScopBuilder/ScopInfo] Move and inline Scop::init into ScopBuilder::buildScop. NFC. The method is only needed in the SCoP building phase, and doesn't need to be part of the general API. llvm-svn: 312114	2017-08-30 13:04:39 +00:00
Michael Kruse	860870b7b0	[ScopBuilder] Report to dbgs() on SCoP bailout. NFC. This allows to use -debug to see that a SCoP was found in ScopDetect, but dismissed by ScopBuilder. llvm-svn: 312113	2017-08-30 11:52:03 +00:00
Michael Kruse	591255183b	[ScopBuilder] Introduce metadata for splitting scop statement. This patch allows annotating of metadata in ir instruction (with "polly_split_after"), which specifies where to split a particular scop statement. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36402 llvm-svn: 312107	2017-08-30 10:11:06 +00:00
Michael Kruse	99cc9ded41	Do not consider mem intrinsics as error. The intrinsics memset, memcpy and memmove do have their memory accesses modeled by ScopBuilder. Do not consider them error-case behavior. Test case will come with a future patch that requires memory intrinsics outside of error blocks. llvm-svn: 312021	2017-08-29 18:27:47 +00:00
Michael Kruse	25d3f85a43	Skip ignored intrinsics. Commit r252725 introduced a "return false" if an ignored intrinsic was found. The consequence of this was that the mere existence of an ignored intrinsic (such as llvm.dbg.value) before a call that would have qualified the block as an error block caused the block to not be treated as an error block. The obvious goal was to just skip ignored intrinsics, not changing the meaning of what an error block is. llvm-svn: 312020	2017-08-29 18:27:42 +00:00
Siddharth Bhat	7de7abb09c	[ScopInfo] Fix comment grammar. "..to be build" -> "..to be built". [NFC] llvm-svn: 311995	2017-08-29 11:46:14 +00:00
Michael Kruse	4728184342	[ZoneAlgo] More fine-grained bail-out. ZoneAlgo used to bail out for the complete SCoP if it encountered something violating its assumptions. This meant that neither OpTree could forward any load nor DeLICM do anything in such cases, even if their transformations are unrelated to the violations. This patch adds a list of compatible elements (currently with the granularity of entire arrays) that can be used for analysis. OpTree and DeLICM can then check whether their transformations only concern compatible elements, and skip non-compatible ones. This will be useful for e.g. Polybench's benchmarks covariance, correlation, bicg, doitgen, durbin, gramschmidt, adi that have assumption violations, but which are not necessarily relevant for all transformations. Differential Revision: https://reviews.llvm.org/D37219 llvm-svn: 311929	2017-08-28 20:39:07 +00:00
Tobias Grosser	ee8ad1c0ff	[IslAst] Do not compare arrays in alias check which are known to be identical This possibly helps to avoid run-time check failures in the COSMO kernels. llvm-svn: 311920	2017-08-28 20:17:02 +00:00
Michael Kruse	a4f447c2a4	[PM] Properly require and preserve OptimizationRemarkEmitter. NFCI. Properly require and preserve the OptimizationRemarkEmitter for use in ScopPass. Previously one had to get the ORE from ScopDetection because CodeGeneration did not mark it as preserved. It would need to be recomputed which results in the legacy PM to throw away all previous SCoP analysis. This also changes the implementation of ScopPass::getAnalysisUsage to not unconditionally preserve all passes, but only those needed to be preserved by any SCoP pass (at least when using the legacy PM). This allows invalidating DependenceInfo (and IslAstInfo) in case the pass would cause them to change (e.g. OpTree, DeLICM, MaximalArrayExpansion) JSONImporter should also invalidate the DependenceInfo. In this patch it marks DependenceInfo as preserved anyway because some regression tests depend on it. Differential Revision: https://reviews.llvm.org/D37010 llvm-svn: 311888	2017-08-28 14:07:33 +00:00
Michael Kruse	e983e6b1c5	[ZoneAlgo] Print rejection reasons to llvm::dbgs(). NFC. llvm-svn: 311885	2017-08-28 11:22:23 +00:00
Tobias Grosser	93ab558d2e	[Detect] Consider nested loop profitable if entry block is not in loop In cases where the entry block of a scop was not contained in a loop that was part of the scop region and at the same time there was a loop surrounding the scop, we failed to count the loops in the scop and consequently did not consider the scop profitable. We correct this by only moving to the loop parent if the current loop is contained in the scop. This increases the number of loops in COSMO which we assume to be profitable from 3974 to 4981. llvm-svn: 311863	2017-08-27 21:39:25 +00:00
Philipp Schaad	8cb2e3245c	[Polly][GPGPU] Fixed undefined reference for CUDA's managed memory in Runtime library. llvm-svn: 311848	2017-08-27 12:50:51 +00:00
Eugene Zelenko	a32707d5b1	[Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311802	2017-08-25 21:35:27 +00:00
Eugene Zelenko	9248fde53a	[Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311704	2017-08-24 21:22:41 +00:00
Tobias Grosser	6d0970f64e	Revert "[polly] Fix ScopDetectionDiagnostic test failure caused by r310940" This reverts commit 950849ece9bb8fdd2b41e3ec348b9653b4e37df6. This commit broke various buildbots. llvm-svn: 311692	2017-08-24 19:47:15 +00:00
Michael Kruse	b795bfc0d4	[CodeGen] Detect impossible partial write conditions more reliably. Whether a partial write is tautological/unsatisfiable not only depends on the access domain, but also on the domain covered by its node in the AST. In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0). for (int c0 = 0; c0 < tmp5; c0 += 1) { Stmt_for_body344(c0); if (tmp5 >= c0 + 2) Stmt_cond_false(c0); Stmt_cond_end(c0); } if (tmp5 <= 0) { Stmt_for_body344(0); Stmt_cond_false(0); Stmt_cond_end(0); } Isl cannot derive a subscript for an array element that is never accessed. This caused an error in that no subscript expression has been generated in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one to exist because there is an execution of that write, just not in that AST node. Fixed by inspecting whether isl generated a constant "false" AST expression in the current AST node, instead of determining whether the access domain is empty. This should fix a compiler crash of the aosp buildbot. llvm-svn: 311663	2017-08-24 14:51:35 +00:00
Siddharth Bhat	78027437e6	[Polly] [PPCGCodeGeneration] Mild refactoring of checking validity of functions in a kernel. This is a stylistic change to make the function a little more readable. Also add a debug print to show what instruction contains a use of a function we don't understand in the kernel. Differential Revision: https://reviews.llvm.org/D37058 llvm-svn: 311648	2017-08-24 09:54:15 +00:00
Andreas Simbuerger	e478e2de83	[Polly][WIP] Scalar fully indexed expansion Summary: This patch comes directly after https://reviews.llvm.org/D34982 which allows fully indexed expansion of MemoryKind::Array. This patch allows expansion for MemoryKind::Value and MemoryKind::PHI. MemoryKind::Value seems to be working with no major modifications of D34982. A test case has been added. Unfortunately, no "run time" checks can be done for now because, as @Meinersbur explains in a comment on D34982, DependenceInfo needs to be cleared and reset to take expansion into account in the remaining part of the Polly pipeline. There is no way to do that in Polly for now. MemoryKind::PHI is not working. Test case is in place, but not working. To expand MemoryKind::Array, we first expand the write and then the reads. For MemoryKind::PHI, the idea of the current implementation is to exchange the "roles" of the read and write and first expand the read according to its domain and then the writes. But with this strategy, I still encounter the problem of union_map in the new access map. 
For example with the following source code (source code of the test case) : ``` void mse(double A[Ni], double B[Nj]) { int i,j; double tmp = 6; for (i = 0; i < Ni; i++) { for (int j = 0; j<Nj; j++) { tmp = tmp + 2; } B[i] = tmp; } } ``` Polly gives us the following statements and memory accesses : ``` Statements { Stmt_for_body Domain := { Stmt_for_body[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_body[i0] -> [i0, 0, 0] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_04__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_11__phi[] }; Instructions { %tmp.04 = phi double [ 6.000000e+00, %entry.split ], [ %add.lcssa, %for.end ] } Stmt_for_inc Domain := { Stmt_for_inc[i0, i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9999 }; Schedule := { Stmt_for_inc[i0, i1] -> [i0, 1, i1] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_add_lcssa__phi[] }; Instructions { %tmp.11 = phi double [ %tmp.04, %for.body ], [ %add, %for.inc ] %add = fadd double %tmp.11, 2.000000e+00 %exitcond = icmp ne i32 %inc, 10000 } Stmt_for_end Domain := { Stmt_for_end[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_end[i0] -> [i0, 2, 0] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_tmp_04__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_add_lcssa__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 0] { Stmt_for_end[i0] -> MemRef_B[i0] }; Instructions { %add.lcssa = phi double [ %add, %for.inc ] store double %add.lcssa, double* %arrayidx, align 8 %exitcond5 = icmp ne i64 %indvars.iv.next, 10000 } } ``` and the following dependences : ``` { Stmt_for_inc[i0, 9999] -> Stmt_for_end[i0] : 0 <= i0 <= 9999; 
Stmt_for_inc[i0, i1] -> Stmt_for_inc[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998; Stmt_for_body[i0] -> Stmt_for_inc[i0, 0] : 0 <= i0 <= 9999; Stmt_for_end[i0] -> Stmt_for_body[1 + i0] : 0 <= i0 <= 9998 } ``` When trying to expand this memory access : ``` { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ``` The new access map would look like this : ``` { Stmt_for_inc[i0, 9999] -> MemRef_tmp_11__phi_exp[i0] : 0 <= i0 <= 9999; Stmt_for_inc[i0, i1] ->MemRef_tmp_11__phi_exp[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998 } ``` The idea to implement the expansion for PHI access is an idea from @Meinersbur and I don't understand why my implementation does not work. I must have missed something in my understanding of the idea. Contributed by: Nicolas Bonfante <nicolas.bonfante@gmail.com> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur Subscribers: llvm-commits, pollydev, Meinersbur Differential Revision: https://reviews.llvm.org/D36647 llvm-svn: 311619	2017-08-24 00:04:45 +00:00
Michael Kruse	06ed529205	Add more statistics. Add statistics about - Which optimizations are applied - Number of loops in Scops at various stages - Number of scalar/singleton writes at various stages representative for scalar false dependencies - Number of parallel loops These will be useful to find regressions due to moving Polly further down LLVM's pass pipeline. Differential Revision: https://reviews.llvm.org/D37049 llvm-svn: 311553	2017-08-23 13:50:30 +00:00
Michael Kruse	7fac28fa4f	[ScopDetect] Include zero-iteration loops in loop count. Loops with zero iterations are, syntactically, loops. They have been excluded from the loop counter even for the non-profitable counters. This seems to be unintentional, as the sentinel value of '0' minimal iterations does exclude such loops. Fix by never considering the iteration count when the sentinel value of 0 is found. This makes the recently added NumTotalLoops counter redundant with NumLoopsOverall, which now is equivalent. Hence, NumTotalLoops is removed as well. Note: The test case 'ScopDetect/statistics.ll' effectively does not check profitability, because -polly-process-unprofitable is passed to all test cases. llvm-svn: 311551	2017-08-23 13:29:59 +00:00
Michael Kruse	99fba1fd52	[ScopInliner] Fix hidden overload warning. NFC. By exposing the hidden member, but as private. llvm-svn: 311550	2017-08-23 13:07:43 +00:00
Michael Kruse	a1579aab46	[MaximumStaticExpansion] Avoid warning in release builds. Conditionally compile function only used in an assert(). llvm-svn: 311549	2017-08-23 12:50:02 +00:00
Michael Kruse	3044dc51cf	[PPCGCodeGen] Fix compiler warning: '<': signed/unsigned mismatch. NFC. MSVC warns about comparison between a signed and unsigned integer. The rules of C(++) define that an unsigned comparison has to be carried out in this case. This is unlikely to be intended. Fix by assigning the loop's upper bound to a signed integer first. This also avoids repeated evaluation of the invariant upper bound. llvm-svn: 311548	2017-08-23 12:45:25 +00:00
Michael Kruse	594386e773	[ScopInfo] Remove stray semicolon. NFC. llvm-svn: 311547	2017-08-23 12:34:37 +00:00
Tobias Grosser	d680edfb98	Move include/isl-noexceptions.h to include/isl/isl-noexceptions.h llvm-svn: 311504	2017-08-22 22:04:22 +00:00
Jakub Kuderski	0ac1e585fc	[polly] Fix ScopDetectionDiagnostic test failure caused by r310940 Summary: ScopDetection used to check if a loop within a region was infinite and emitted a diagnostic in such cases. After r310940 there's no point checking against that situation, as infinite loops don't appear in regions anymore. The test failure was observed on these two polly buildbots: http://lab.llvm.org:8011/builders/polly-arm-linux/builds/8368 http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/10310 This patch XFAILs `ReportLoopHasNoExit.ll` and turns infinite loop detection into an assert. Reviewers: grosser, sanjoy, bollu Reviewed By: grosser Subscribers: efriedma, aemerson, kristof.beyls, dberlin, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36776 llvm-svn: 311503	2017-08-22 22:01:53 +00:00
Tobias Grosser	4a07bbe3f6	[IRBuilder] Only emit alias scop metadata for arrays, but not scalars Summary: There is no need to emit alias metadata for scalars, as basicaa will easily distinguish them from arrays. This reduces the size of the metadata we generate. This is especially useful after we moved to -polly-position=before-vectorizer, where a lot more scalar dependences are introduced, which increased the size of the alias analysis metadata and made us commonly reach the limits after which we do not emit alias metadata that have been introduced to prevent quadratic growth of this alias metadata. This improves 2mm performance from 1.5 seconds to 0.17 seconds. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37028 llvm-svn: 311498	2017-08-22 21:58:48 +00:00
Eugene Zelenko	bff61d220e	[Polly] Satisfy Clang-format for r311489 changes, but it's weird that Clang-format didn't complain about headers order in previous versions (NFC). llvm-svn: 311494	2017-08-22 21:47:17 +00:00
Eugene Zelenko	0c4c2ce0b0	[Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311489	2017-08-22 21:25:51 +00:00
Roman Gareev	6bfeba24d3	[NFC] Fix the broken comment. llvm-svn: 311477	2017-08-22 17:43:03 +00:00
Roman Gareev	0956a606ff	Disable the Loop Vectorizer in case of GEMM Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473	2017-08-22 17:38:46 +00:00
Michael Kruse	595b77bc0b	[ScopInfo] Fix typos in comment. NFC. llvm-svn: 311472	2017-08-22 17:32:51 +00:00
Siddharth Bhat	14544a8068	[GPUJIT] Make max managed pointers an environment variable. This was originally a `#define`. It is much easier to play around with this as an environment variable when we run on large programs. Differential Revision: https://reviews.llvm.org/D37012 llvm-svn: 311471	2017-08-22 17:32:27 +00:00
Michael Kruse	a28260f486	[test] Do not pipe binary data to FileCheck. llvm-svn: 311470	2017-08-22 17:09:56 +00:00
Michael Kruse	5b228bbb12	[ScopDetection] Add stat for total number of loops. The total number of loops is useful as a baseline comparing how many loops have been optimized in different configurations. llvm-svn: 311469	2017-08-22 17:09:51 +00:00
Siddharth Bhat	cb5155bf6d	[ManagedMemoryRewrite] Use `uint64_t` to store size, not `int`. llvm-svn: 311440	2017-08-22 09:30:37 +00:00
Siddharth Bhat	603544863f	[ManagedMemoryRewrite] Get size in bytes rather than in bits and dividing by 8. llvm-svn: 311439	2017-08-22 09:27:41 +00:00
Tobias Grosser	6683c81af8	test/GPGPU/invalid-kernel-assert-verifymodule.ll also requires assertions llvm-svn: 311423	2017-08-22 03:12:29 +00:00
Michael Kruse	f281ae5992	[test] Add some test cases for computeArrayUnused. llvm-svn: 311404	2017-08-21 23:04:55 +00:00
Michael Kruse	ade14269cd	[DeLICM] Fix unused zone for writes without in-between read. The implementation of computeArrayUnused did not consider writes without reads before, except for the first write in the SCoP. This caused it to 'forget' writes directly following another write. This patch re-adds the entire reaching definition of a write that has not been covered before by a read. This fixes Polybench 4.2 2mm where only one of the matrix multiplications was detected. llvm-svn: 311403	2017-08-21 23:04:45 +00:00
Siddharth Bhat	a8c329b0eb	[ManagedMemoryRewrite] slightly tweak debug output style. [NFC] llvm-svn: 311361	2017-08-21 18:58:33 +00:00
Siddharth Bhat	557ce3a8b0	[ManagedMemoryRewrite] Print reasons for skipping global array to dbgs(). [NFC] llvm-svn: 311360	2017-08-21 18:52:15 +00:00
Tobias Grosser	0dd42512ff	[ZoneAlgorithm] Move computeScalarReachingDefinition to c++ llvm-svn: 311336	2017-08-21 14:19:40 +00:00
Siddharth Bhat	0a198dc18a	[ManagedMemoryRewrite] hide debug output behind DEBUG(...). [NFC] llvm-svn: 311331	2017-08-21 12:51:57 +00:00
Siddharth Bhat	7bc77e87c8	[ScopInfo] Add option to treat all function parameters as dereferenceable. Dragonegg generates most function parameters as pointers to the actual parameters. However, it does not mark these parameters with the dereferenceable attribute. Polly is conservative when it comes to invariant load hoisting, thus we add runtime checks to invariant load hoisted pointers when we do not know that pointers are dereferenceable. This is correct behaviour, but is a performance penalty. Add a flag that allows all pointer parameters to be dereferenceable. That way, polly can speculatively load-hoist parameters to functions without runtime checks. Differential Revision: https://reviews.llvm.org/D36461 llvm-svn: 311329	2017-08-21 11:57:04 +00:00
Siddharth Bhat	7b9f5ca27e	[PPCGCodeGeneration] Enable `polly-codegen-perf-monitoring` for PPCGCodegen. This feature was not enabled for `PPCGCodeGeneration`. Now that this is enabled, we can benchmark Scops that have been optimised with `-polly-codegen-ppcg` with the `-polly-codegen-perf-monitoring` option. Differential Revision: https://reviews.llvm.org/D36934 llvm-svn: 311328	2017-08-21 11:44:01 +00:00
Tobias Grosser	b09bd74da8	[GPGPU] Add llvm.powi to the libdevice supported functions These intrinsics are used in COSMO. llvm-svn: 311324	2017-08-21 09:52:08 +00:00
Tobias Grosser	5170b6627a	[GPGPU] Add log / logf to the libdevice supported functions These two functions are used in COSMO llvm-svn: 311322	2017-08-21 09:00:31 +00:00
Michael Kruse	d091bf8d8e	[MatMul] Make MatMul detection independent of internal isl representations. The pattern recognition for MatMul is restrictive. The number of "disjuncts" in the isl_map containing constraint information was previously required to be 1 (as per isl_*_coalesce - which should ideally produce a domain map with a single disjunct, but does not under some circumstances). This was changed and made more flexible. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36460 llvm-svn: 311302	2017-08-20 21:31:11 +00:00
Siddharth Bhat	9a5a278f78	[GPUJIT] Switch from Runtime API calls for managed memory to Driver API calls. We now load the function pointer for `cuMemAllocManaged` dynamically, so it should be possible to compile `GPUJIT` on non-CUDA systems again. It should now be possible to link on non-cuda systems again. Thanks to Philipp Schaad for noticing this inconsistency. Differential Revision: https://reviews.llvm.org/D36921 llvm-svn: 311289	2017-08-20 13:38:04 +00:00
Tobias Grosser	e32498c9c3	Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]" We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268	2017-08-19 23:49:26 +00:00
Tobias Grosser	9041118983	[ManagedMemoryRewrite] Make pass more robust and fix memory issue Instead of using Twines and temporary expressions, we do string manipulation through a std::string. This resolves a memory corruption issue, which likely was caused by twines losing their underlying string too soon. llvm-svn: 311264	2017-08-19 23:03:45 +00:00
Siddharth Bhat	205a78a6f9	[ManagedMemoryRewrite] Iterate over operands of the expanded instruction, not the constantexpr itself. - We should iterate over `I`, which is `Cur` expanded out to an instruction, and not `Cur` itself. - This is a bugfix. Differential Revision: https://reviews.llvm.org/D36923 llvm-svn: 311261	2017-08-19 20:52:11 +00:00
Tobias Grosser	ecb94a0392	[GPGPU] Correctly initialize array order and fixed_element information Summary: This information is necessary for PPCG to perform correct live-range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259	2017-08-19 20:21:22 +00:00
Philipp Schaad	50139f0f38	[PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtime Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen. Differential revision: D36925 llvm-svn: 311248	2017-08-19 17:04:57 +00:00
Tobias Grosser	9f2eb24c06	Clarify the intent of the run-time check llvm-svn: 311243	2017-08-19 16:26:39 +00:00
Tobias Grosser	43df2020e7	[GPGPU] Collect parameter dimension used in MemoryAccesses When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239	2017-08-19 12:58:28 +00:00
Tobias Grosser	d5f1fad77c	[Polly] Run early cse + memory SSA to remove redundancies in the input code This allows us to get rid of many identical loads as they commonly appear in Fortran code. llvm-svn: 311231	2017-08-19 08:44:46 +00:00
Andreas Simbuerger	8d5b257d02	[Polly][Bug fix] Wrong dependences filtering during Fully Indexed expansion Summary: When trying to expand memory accesses, the current version of Polly uses statement Level dependences. The actual implementation is not working in case of multiple dependences per statement. For example in the following source code : ``` void mse(double A[Ni], double B[Nj], double C[Nj], double D[Nj]) { int i,j; for (j = 0; j < Ni; j++) { for (int i = 0; i<Nj; i++) S: B[i] = i; for (int i = 0; i<Nj; i++) T: D[i] = i; U: A[j] = B[j]; C[j] = D[j]; } } ``` The statement U has two dependences with S and T. The current version of polly fails during expansion. This patch aims to fix this bug. For that, we use Reference Level dependences to be able to filter dependences according to statement and memory ref. The principle of expansion remains the same as before. We also noticed that we need to bail out if a load comes after a store (at the same position) in the same statement. So a check was added to isExpandable. Contributed by: Nicholas Bonfante <nicolas.bonfante@insa-lyon.fr> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur, simbuerg Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36791 llvm-svn: 311165	2017-08-18 15:01:18 +00:00
Tobias Grosser	ec02acfb98	[GPGPU] Simplify PPCGSCop to reduce compile time [NFC] Summary: Drop unused parameter dimensions to reduce the size of the sets we are working with. Especially the computed dependences tend to accumulate a lot of parameters that are present in the input memory accesses, but often not necessary to express the actual dependences. As isl represents maps and sets with dense matrices, reducing the dimensionality of isl sets commonly improves code generation performance. This reduces compile time from 17 to 11 seconds for our test case. While this is not impressive, this patch helped me to identify the previous two performance improvements and additionally also increases readability of the isl data structures we use. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36869 llvm-svn: 311161	2017-08-18 13:38:12 +00:00
Siddharth Bhat	656e629572	[Polly] [PPCGCodeGeneration] Print current Scop and loop depth in PPCGCodeGen. [NFC] Differential Revision: https://reviews.llvm.org/D36871 llvm-svn: 311158	2017-08-18 13:16:58 +00:00
Tobias Grosser	861a387fac	[GPGPU] Do not create copy statements when targeting managed memory Summary: They are not used and consequently do not even need to be computed. This reduces the overall compile time for our kernel from 1m33s to 17s. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36868 llvm-svn: 311157	2017-08-18 13:11:05 +00:00
Tobias Grosser	62acb344d0	[GPGPU] Synchronize after each kernel, not each copy out Summary: This change reduces the overall number of synchronize calls for kernels with a lot of output data at the cost of additional synchronize calls for kernels launched in sequence without any device to host transfers in between. As the latter pattern is a lot less frequent, this seems a better tradeoff. Even though the above motivation would be motivation enough, this is just a step towards enabling ppcg to not compute to and from device copy calls at all, which would be incorrect in case we still relied on these calls to place our synchronization statements. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36867 llvm-svn: 311155	2017-08-18 12:55:58 +00:00
Siddharth Bhat	dd616e9519	[ScopInliner] Move DEBUG-TYPE to below all includes to prevent cross-module interaction. [NFC] This fixes compile errors. llvm-svn: 311130	2017-08-17 22:21:16 +00:00
Tobias Grosser	fa03cb7687	[GPGPU] Only collect the accesses that belong to an array [NFC] This avoids the construction of very large sets and in many cases also keeps the number of parameters low. As a result, we see a compile time reduction from 5 minutes to only slightly above 1 minute for one of our larger test cases. llvm-svn: 311127	2017-08-17 22:04:53 +00:00
Siddharth Bhat	b46847c035	[ScopInliner] Add a simple Scop-based inliner to polly. We add a ScopInliner pass which inlines functions based on a simple heuristic: Let `g` call `f`. If we can model all of `f` as a Scop, we inline `f` into `g`. This requires `-polly-detect-full-function` to be enabled. So, the pass asserts that `-polly-detect-full-function` is enabled. Differential Revision: https://reviews.llvm.org/D36832 llvm-svn: 311126	2017-08-17 21:57:23 +00:00
Tobias Grosser	d2e57981fd	[GPGPU] Move getExtend to C++ [NFC] llvm-svn: 311123	2017-08-17 21:20:28 +00:00
Siddharth Bhat	a2c4112791	[ManagedMemoryRewrite] Rewrite malloc, free correctly inside `Constant`s. Reuse the machinery built for replacing global arrays to replace malloc/free as well. Example replacement that was missed earlier: ``` call void \ bitcast (void (i8) @free to void (%custom_type)) (%custom_type* %13) ``` - Since the `bitcast` is a `ConstantExpr`, `replaceAllUsesWith` would miss this. We don't miss this anymore. Differential Revision: https://reviews.llvm.org/D36825 llvm-svn: 311121	2017-08-17 20:26:38 +00:00
Tobias Grosser	abc5416be1	[GPGPU] Make test case independent of LLVM names In release builds LLVM may not pass along LLVM names consistently. We make the test cases independent of the LLVM-IR names to avoid spurious test case failures. llvm-svn: 311118	2017-08-17 20:09:02 +00:00
Siddharth Bhat	8a2c07f6d4	[ManagedMemoryRewrite] Learn how to rewrite global arrays, allocas. - If we have global arrays, we would like to rewrite them to global pointers which are allocated using `cudaMallocManaged`. - If we have allocas in a function, we would like to rewrite them to heap-allocations with `cudaMallocManaged` and `cudaFree`. - With these rewrite mechanisms, we can offload _any_ function to the GPU with no code rewrite whatsoever. Differential Revision: https://reviews.llvm.org/D36516 llvm-svn: 311080	2017-08-17 11:22:52 +00:00
Tobias Grosser	ed6a4acc7f	Add rewrite by-reference parameter pass Summary: This pass detangles induction variables from functions, which take variables by reference. Most Fortran functions compiled with gfortran pass variables by reference. Unfortunately a common pattern, printf calls of induction variables, prevents in this situation the promotion of the induction variable to a register, which again inhibits any kind of loop analysis. To work around this issue we developed a specialized pass which introduces separate alloca slots for known-read-only references, which indicate to the mem2reg pass that the induction variables can be promoted to registers and consequently enable SCEV to work. We currently hardcode the information that a function _gfortran_transfer_integer_write does not read its second parameter, as dragonegg does not add the right annotations and we cannot change old dragonegg releases. Hopefully flang will produce the right annotations. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: mgorny, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36800 llvm-svn: 311066	2017-08-17 05:25:08 +00:00
Tobias Grosser	5502eb0986	Add missing 'REQUIRES' line llvm-svn: 311046	2017-08-16 22:02:03 +00:00
Tobias Grosser	e2a45f32dc	[GPGPU] Also record invariant loads as kernel subtree values Before this change kernels that used invariant loads would have resulted in invalid PTX code. llvm-svn: 311042	2017-08-16 21:37:53 +00:00
Michael Kruse	91e55322b9	[ScopInfo] Clarify comment. NFC. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36760 llvm-svn: 310999	2017-08-16 09:28:02 +00:00
Jakub Kuderski	8fb57125b0	[Polly] XFAIL ReportLoopHasNoExit tests after r310940 ReportLoopHasNoExit started failing after r310940 that added infinite loops to postdominators. The change made regions not contain infinite loops anymore. This patch unbreaks the polly tree by XFAILING the ReportLoopHasNoExit test. Full fix is under review in D36776. llvm-svn: 310980	2017-08-16 00:18:39 +00:00
Siddharth Bhat	bb30377c5a	[Polly] [GPUJIT] Set min size to 1 on CUDA allocation calls. [NFC] Requesting size 0 allocations from `cuMalloc` / `cuMallocManaged` fails. If there is a size 0 allocation that can be statically proved, then we fail at PPCGCodeGeneration. This is because if size 0 allocation could take place, we should not generate code that tries to use this array. However, there are cases where we cannot statically prove this, and at runtime we get a request for 0 bytes of memory. We choose to allocate size 1 to allow the program to continue running. Differential Revision: https://reviews.llvm.org/D36751 llvm-svn: 310941	2017-08-15 18:21:38 +00:00
Tobias Grosser	b8417531dd	[Polly] Move ScopStmt::checkForReductions to islpp. NFC. Reviewers: grosser, bollu Differential Revision: https://reviews.llvm.org/D36714 llvm-svn: 310908	2017-08-15 03:45:55 +00:00
Tobias Grosser	1e09c1363c	Move ScopStmt::getSchedule to islpp. NFC. Reviewers: grosser, Meinersbur, bollu Differential Revision: https://reviews.llvm.org/D36660 llvm-svn: 310815	2017-08-14 06:49:06 +00:00
Tobias Grosser	990cbb4310	[Polly] Move Scop::restrictDomains to islpp. NFC. Reviewers: grosser, Meinersbur, bollu Differential Revision: https://reviews.llvm.org/D36659 llvm-svn: 310814	2017-08-14 06:49:01 +00:00
Tobias Grosser	6e78cc6b12	[ScopInfo] Translate ParameterIds to isl++ llvm-svn: 310795	2017-08-13 17:54:51 +00:00
Reid Kleckner	8d719a27f5	Fix two warnings in polly, -Wmismatched-tags and -Wreorder llvm-svn: 310667	2017-08-10 21:46:22 +00:00
Philip Pfaffe	7b5eaa6f62	Add missing license text to two headers. NFC. llvm-svn: 310612	2017-08-10 15:40:36 +00:00
Philip Pfaffe	c3bcdc2f1a	[JSON] Make the failure to parse a jscop file a hard error Summary: Before, if we fail to parse a jscop file, this will be reported as an error and importing is aborted. However, this isn't actually strong enough, since although the import is aborted, the scop has already been modified and is very likely broken. Instead, make this a hard failure and throw an LLVM error. This new behaviour requires small changes to the tests for the legacy pass, namely using `not` to verify the error. Further, fixed the jscop file for the base_pointer_load_is_inst_inside_invariant_1 testcase. Reviewed By: Meinersbur Split out of D36578. llvm-svn: 310599	2017-08-10 14:53:25 +00:00
Philip Pfaffe	47bf15c34f	[JSON][PM] Port json import/export over to new pm Summary: I pulled out all functionality into static functions, and use those both in the legacy passes and in the new ones. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D36578 llvm-svn: 310597	2017-08-10 14:45:09 +00:00
Philip Pfaffe	e18f3f6708	Fix 310555: Require pollyacc instead of asserts llvm-svn: 310595	2017-08-10 14:21:04 +00:00
Philip Pfaffe	0360e5a3c2	Fix r310304: Fix the lit testcases. In opt, Polly passes are only available after -load. llvm-svn: 310581	2017-08-10 10:54:26 +00:00
Tobias Grosser	4db39c4829	Add missing 'REQUIRES' line llvm-svn: 310555	2017-08-10 08:11:47 +00:00
Tobias Grosser	cff9696e11	[GPGPU] Make the ast_build available to block generator This is necessary for partial writes (as used by delicm) to work. llvm-svn: 310553	2017-08-10 08:00:56 +00:00
Philip Pfaffe	f43e7c2e97	[Polly][PM] Improve invalidation in the Scop-Pipeline Summary: During code generation for a Scop we modify the IR of a function. While this shouldn't affect a Scop in the formal sense, the implementation caches various information about the IR such as SCEV expressions for bounds or parameters. This cached information needs to be updated or invalidated. To this end, SPMUpdater allows passes to report when they've invalidated a Scop to the PassManager, which will then flush and recompute all Scops. This in turn invalidates all iterators, so references to Scops shouldn't be held. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D36524 llvm-svn: 310551	2017-08-10 07:43:46 +00:00
Siddharth Bhat	9298ff2dee	[ManagedMemoryRewrite] [Polly] Erase original malloc and free. [NFC] We do not need to keep `malloc` and `free` around since they are replaced by `polly_{malloc,free}Managed.` llvm-svn: 310504	2017-08-09 18:19:46 +00:00
Michael Kruse	cd3b9fedc7	Remove dependency of Scop::getStmtFor(Inst) on getStmtFor(BB). NFC. We are working towards removing uses of Scop::getStmtFor(BB). In this patch, we remove dependency of Scop::getStmtFor(Inst) on getStmtFor(BB). To do so, we introduce a map of instructions to their corresponding scop statements and use it to get the instructions' statement. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35663 llvm-svn: 310494	2017-08-09 16:45:37 +00:00
Siddharth Bhat	5a1f872623	[ManagedMemoryRewrite] Remove test case that was submitted by mistake. [NFC] llvm-svn: 310473	2017-08-09 13:34:54 +00:00
Siddharth Bhat	c4a4af47f3	[ManagedMemoryRewrite] Introduce a new pass to rewrite modules to use managed memory. This pass is useful to automatically convert a codebase that uses malloc/free to use their managed memory counterparts. Currently, rewrite malloc and free to the `polly_{malloc,free}Managed` variants. A future patch will teach ManagedMemoryRewrite to rewrite global arrays as pointers to globally allocated managed memory. Differential Revision: https://reviews.llvm.org/D36513 llvm-svn: 310471	2017-08-09 12:59:23 +00:00
Michael Kruse	40d083956c	[CodeGen] Use isLatestArrayKind(). Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen. This should fix various fails of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot. llvm-svn: 310466	2017-08-09 12:27:51 +00:00
Michael Kruse	36550bac0d	[ForwardOpTree] Set DEBUG_TYPE to "polly-optree". The previous value of "polly-delicm" was forgotten to be changed when ForwardOpTree was split from DeLICM. Thanks to Tobias for noticing! llvm-svn: 310465	2017-08-09 12:27:35 +00:00
Michael Kruse	630fc7b82a	[ISLTools/ZoneAlgo] Make distributeDomain and filterKnownValInst isl_error_quota proof. distributeDomain() and filterKnownValInst() are used in a scop of ForwardOpTree that limits the number of isl operations. Therefore some isl functions may return null after any operation. Remove assertions that assume non-null results and handle isl_*_foreach returning isl::stat::error. I hope this fixes the crash of the asop buildbot at ihevc_recon.c. llvm-svn: 310461	2017-08-09 11:21:40 +00:00
Michael Kruse	8756b3fbec	[ZoneAlgo] Add motivation for exception. NFC. Suggested-by: Hongbin Zheng <etherzhhb@gmail.com> llvm-svn: 310455	2017-08-09 09:29:15 +00:00
Michael Kruse	a9033aaba2	[ZoneAlgo] Consolidate condition. NFC. No need to create an OptimizationRemarkMissed object if we are not going to use it anyway. llvm-svn: 310454	2017-08-09 09:29:09 +00:00
Siddharth Bhat	34eeabbca3	[PPCGCodeGeneration] Compute element size in bytes for arrays correctly. Previously, we used to compute this with `elementSizeInBits / 8`. This would yield an element size of 0 when the array had element size < 8 in bits. To fix this, ask data layout what the size in bytes should be. Differential Revision: https://reviews.llvm.org/D36459 llvm-svn: 310448	2017-08-09 08:29:16 +00:00
Michael Kruse	235726ee4b	[test] Add descriptions and pseudocode to tests. NFC. llvm-svn: 310385	2017-08-08 17:26:19 +00:00
Michael Kruse	ce67358281	[DeLICM/ZoneAlgo] Remove duplicate code. NFC. DeLICM and ZoneAlgo both implemented filterKnownValInst. Declare ZoneAlgo's version in the header and let DeLICM use it. llvm-svn: 310381	2017-08-08 17:00:27 +00:00
Roman Gareev	1563f039f5	Use SCEV information for the second level aliasing We introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. To distinguish two accesses, the comparison of raw pointers representing base pointers is used. In case of, for example, ublas's prod function that implements GEMM, and DeLICM we can get accesses to same location represented by different raw pointers. Consequently, we create different alias sets that can prevent accesses from, for example, being sunk or hoisted. To avoid the issue, we compare the corresponding SCEV information instead of the corresponding raw pointers. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D35761 llvm-svn: 310380	2017-08-08 16:50:28 +00:00
Roman Gareev	dbde718676	Do not use isl_set_project_out to get all loop prefixes Currently, only convex isolation sets can be efficiently processed by isl. Consequently, as a temporary solution, we use a different algorithm for partial tile isolation that helps to build convex isolation sets in some cases. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36278 llvm-svn: 310374	2017-08-08 16:15:33 +00:00


3718 Commits