llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	68821a8b91	[ZoneAlgo/ForwardOpTree] Normalize PHIs to their known incoming values. Represent PHIs by their incoming values instead of an opaque value of themselves. This allows ForwardOpTree to "look through" the PHIs and forward the incoming values since forwarding PHIs is currently not supported. This is particularly useful to cope with PHIs inserted by GVN LoadPRE. The incoming values all resolve to a load from a single array element which then can be forwarded. It should in theory also reduce spurious conflicts in value mapping (DeLICM), but I have not yet found a profitable case, so it is not included here. To avoid transitive closure and potentially necessary overapproximations of those, PHIs that may reference themselves are excluded from normalization and keep their opaque self-representation. Differential Revision: https://reviews.llvm.org/D39333 llvm-svn: 317008	2017-10-31 16:11:46 +00:00
Michael Kruse	ff426d974d	[DeLICM] Fix wrong assumed access execution order. ForwardOpTree may already transform a scalar access to an array access. The access remains implicit (isOriginalScalarKind(), meaning that the access is always executed at the begin/end of a statement), but targets an array (isLatestArrayKind(), which is unrelated to whether the execution is implicit/explicit). Fix by properly using isOriginalXXX() to determine execution order. This fixes the buildbots on MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG. llvm-svn: 316995	2017-10-31 12:50:25 +00:00
Michael Kruse	06618bf71a	[OpenMP] Fix reference collection of latest base ptrs. When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983	2017-10-31 10:28:22 +00:00
Philip Pfaffe	9b1d1e6ae7	Fix two testcases. NFC intended. Add missing %loadPolly directive to support out of tree builds. One of the changes is somewhat bigger, because the directive turns on LLVM names, and the testcase doesn't use those. llvm-svn: 316870	2017-10-29 21:00:48 +00:00
Michael Kruse	822dfe271b	[ForwardOpTree] Reload known values. For scalar accesses, change the access target to an array element that is known to contain the same value. This may become an alternative to forwardKnownLoad which creates new loads (and therefore closer to forwarding speculatives). Reloading does not require the known value originating from a load, but can be a store as well. Differential Revision: https://reviews.llvm.org/D39325 llvm-svn: 316766	2017-10-27 14:26:14 +00:00
Michael Kruse	b6b65834a1	[Simplify] Mark (and sweep) based on latest access relation. Previously we marked scalars based on the original access function. However, when a scalar read access is redirected, the original definition (or incoming values of a PHI) is not used anymore, and can be deleted (unless referenced by use that has not been redirected). llvm-svn: 316660	2017-10-26 12:34:36 +00:00
Michael Kruse	37d57dac63	[DeLICM] Add more tests for loop layouts. NFC. llvm-svn: 316642	2017-10-26 08:03:28 +00:00
Michael Kruse	19cd61dc11	[DeLICM] Do not try to map to multiple array elements. Add check and skip when the store used to determine the target accesses multiple array elements. Only a single array location should be used for mapping the scalar. Having multiple creates problems when deciding which element to load from. While MemoryAccess::getAddressFunction() should select just one of them, other problems arise in code that assumes that there is just one target element per statement instance. This fixes llvm.org/PR34989 This also reverts r313902 which fixed llvm.org/PR34485 also caused by a non-functional target array element. This patch prevents the situation from occurring in the first place. llvm-svn: 316432	2017-10-24 13:05:24 +00:00
Anna Thomas	0026d91437	[Polly] Add XFAIL to large-numbers-in-boundary-context.ll After rL315683 (improve SCEV to calculate max BETakenCount when end bound of loop is variant and loop is of form {Start,+1, Stride} LT End) this test in polly started failing. However, as discussed in https://reviews.llvm.org/rL315683, this polly test is not a loops bound test and the MaxBECount calculated by SCEV looks correct. The max BECount is the value calculated even when the end bound of loop is invariant. As discussed with Tobias offline, I'm marking this as an XFAIL, until he gets a chance to update the testcase, so the build bot goes to green. llvm-svn: 315912	2017-10-16 15:12:39 +00:00
Michael Kruse	cc345e6e94	[ScopBuilder] Introduce -polly-stmt-granularity=scalar-indep option. The option splits BasicBlocks into minimal statements such that no additional scalar dependencies are introduced. The algorithm is based on a union-find structure, and unites sets if putting them into separate statements would introduce scalar dependencies. As a consequence, instructions may be split into separate statements such that their relative order differs from that of the statements they are in. This is taken into account for instructions whose relative order matters (e.g. memory accesses). The algorithm is generic in that heuristic changes can be made relatively easily. We might relax the order requirement for read-reads or accesses to different base pointers. Forwardable instructions can be made to not cause a join. This implementation gives us a speed-up of 82% in SPEC 2006 456.hmmer benchmark by allowing loop-distribution in a hot loop such that one of the loops can be vectorized. Differential Revision: https://reviews.llvm.org/D38403 llvm-svn: 314983	2017-10-05 13:43:00 +00:00
Tobias Grosser	c52b71db15	[GPGPU] Make sure escaping invariant load hoisted scalars are preserved We make sure that the final reload of an invariant scalar memory access uses the same stack slot into which the invariant memory access was stored originally. Earlier, this was broken as we introduce a new stack slot aside of the preload stack slot, which remained uninitialized and caused our escaping loads to contain garbage. This happened due to us clearing the pre-populated values in EscapeMap after kernel code generation. We address this issue by preserving the original host values and restoring them after kernel code generation. EscapeMap is not expected to be used during kernel code generation, hence we clear it during kernel generation to make sure that any unintended uses are noticed. llvm-svn: 314894	2017-10-04 10:24:23 +00:00
Jakub Kuderski	119753ad14	UnXFAIL tests that previously failed VerifyDFSNumbers They started passing again by the DT::eraseNode fix in r314847. llvm-svn: 314850	2017-10-03 21:23:56 +00:00
Jakub Kuderski	3c3bf74022	XFAIL two tests that fail VerifyDFSNumbers DominatorTree check This patch XFAILs two tests that start to fail when verifying DT's DFS numbers, as per Tobias' suggestion. Related VerifyDFSNumbers patch: D38331. llvm-svn: 314800	2017-10-03 14:31:53 +00:00
Michael Kruse	f5745b4e7d	[ScopBuilder] Build invariant loads separately. Create the MemoryAccesses of invariant loads separately and before all other MemoryAccesses. Invariant loads are classified as synthesizable and therefore are not contained in any statement. When iterating over all instructions of all statements, the invariant loads are consequently not processed and iterating over them separately becomes necessary. This patch can change the order in which MemoryAccesses are created, but otherwise has no functional change. Some temporary code is introduced to ensure correctness, but will be removed in the next commit. llvm-svn: 314664	2017-10-02 11:41:27 +00:00
Michael Kruse	89a6f3db02	[ScopBuilder] Build escaping dependencies separately. Instructions that compute escaping values might be synthesizable and therefore not contained in any ScopStmt. When buildAccessFunctions is changed to only iterate over the instruction list of statement, "free" instructions still need to be written. We do this after the main MemoryAccesses have been created. This can change the order in which MemoryAccesses are created, but has otherwise no functional change. llvm-svn: 314663	2017-10-02 11:41:19 +00:00
Michael Kruse	c013399197	[ScopDetect] Do not add loads out of the SCoP to required invariant loads. Loads before the SCoP are always invariant within the SCoP and therefore are no "required invariant loads". An assertion fails in ScopBuilder when it finds such an invariant load. Fix by not adding such loads to the required invariant load list. This likely will cause the region to be not considered a valid SCoP. We may want to unconditionally accept instructions defined before the region as valid invariant conditions instead of rejecting them. This fixes a compilation crash of SPEC CPU2006 453.povray's render.cpp. llvm-svn: 314636	2017-10-01 22:19:28 +00:00
Tobias Grosser	d215e684b3	Add missing REQUIRES line llvm-svn: 314625	2017-10-01 13:14:40 +00:00
Tobias Grosser	2fb847fbf6	[GPGPU] Set Polly's RTC to false in case invariant load hoisting fails This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and makes sure that we fall back to the original code. It seems when invariant load hoisting was introduced to the GPGPU backend we missed to reset the RTC flag, such that kernels where invariant load hoisting failed executed the 'optimized' SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this results in hard to debug issues that are a lot of fun to debug. llvm-svn: 314624	2017-10-01 12:39:14 +00:00
Tobias Grosser	1f93d0f1f9	[ScopInfo] Allow PHI nodes that reference an error block As long as these PHI nodes are only referenced by terminator instructions. llvm-svn: 314212	2017-09-26 15:00:10 +00:00
Tobias Grosser	5e531dfef4	[ScopInfo] Allow invariant loads in branch conditions In case the value used in a branch condition is a load instruction, assume this load to be invariant. llvm-svn: 314146	2017-09-25 20:27:15 +00:00
Tobias Grosser	0a62b2d887	[ScopInfo] Allow uniform branch conditions If all but one branch come from an error condition and the incoming value from this branch is a constant, we can model this branch. llvm-svn: 314116	2017-09-25 16:37:15 +00:00
Tobias Grosser	ee457594c2	[ScopDetect/Info] Look through PHIs that follow an error block In case a PHI node follows an error block we can assume that the incoming value can only come from the node that is not an error block. As a result, conditions that seemed non-affine before are now in fact affine. This is a recommit of r312663 after fixing test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll llvm-svn: 314075	2017-09-24 09:25:30 +00:00
Tobias Grosser	75d133f0ac	[IslExprBuilder] Do not generate RTC with more than 64 bit Such RTCs may introduce integer wrapping intrinsics with more than 64 bit, which are translated to library calls on AOSP that are not part of the runtime and will consequently cause linker errors. Thanks to Eli Friedman for reporting this issue and reducing the test case. llvm-svn: 314065	2017-09-23 15:32:07 +00:00
Michael Kruse	bfca5f4334	[DeLICM] Allow non-injective PHIRead->PHIWrite mapping. Remove an assertion that tests the injectivity of the PHIRead -> PHIWrite relation. That is, allow a single PHI write to be used by multiple PHI reads. This may happen due to some statements containing the PHI write not having the statement instances that would overwrite the previous incoming value due to (assumed/invalid) contexts. This results in the PHI write being mapped to multiple targets, which is not supported. Codegen will select one of the targets using getAddressFunction(). However, the runtime check should protect us from this case ever being executed. We therefore allow non-injective PHI relations. Additional calculations to detect/sanitize this case would probably not be worth the computational effort. This fixes llvm.org/PR34485 llvm-svn: 313902	2017-09-21 19:08:23 +00:00
Michael Kruse	6d7a7896ce	[ScopInfo] Use map for value def/PHI read accesses. Before this patch, ScopInfo::getValueDef(SAI) used getStmtFor(Instruction*) to find the MemoryAccess that writes a MemoryKind::Value. In cases where the value is synthesizable within the statement that defines it, the instruction is not added to the statement's instruction list, which means getStmtFor() won't return anything. If the synthesizable instruction is not synthesizable in a different statement (due to being defined in a loop for which ScalarEvolution cannot derive its escape value), we still need a MemoryKind::Value and a write to it that makes it available in the other statements. Introduce a separate map for this purpose. This fixes MultiSource/Benchmarks/MallocBench/cfrac where -polly-simplify could not find the writing MemoryAccess for a use. The write was not marked as required and consequently was removed. Because this could in principle happen as well for PHI scalars, add such a map for PHI reads as well. llvm-svn: 313881	2017-09-21 14:23:11 +00:00
Michael Kruse	0e370cf1a7	Check whether IslAstInfo and DependenceInfo were computed for the same Scop. Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo, we might get those analysis that were computed by a different ScopInfo for a different Scop structure. This would be unfortunate because DependenceInfo and IslAstInfo hold references to resources allocated by ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable things can happen. When the ScopInfo/Scop object is freed, there is a high probability that the new ScopInfo/Scop object will be created at the same heap position with the same address. Comparing whether the Scop or ScopInfo address is the expected therefore is unreliable. Instead, we compare the address of the isl_ctx object. Both, DependenceInfo and IslAstInfo must hold a reference to the isl_ctx object to ensure it is not freed before the destruction of those analyses which might happen after the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx will not be freed and its address not reused as long there is a DependenceInfo or IslAstInfo around. This fixes llvm.org/PR34441 llvm-svn: 313842	2017-09-21 00:01:13 +00:00
Michael Kruse	8dceb76066	[ScheduleOptimizer] Fix and test schedule tree statistics. Fix walking over the schedule tree to collect its properties (Number of permutable bands etc.). Also add regression tests for these statistics. llvm-svn: 313750	2017-09-20 11:53:05 +00:00
Michael Kruse	ef8325ba50	[ForwardOpTree] Test the max operations quota. cl::opt<unsigned long> is not specialized and hence the option -polly-optree-max-ops impossible to use. Replace by supported option cl::opt<unsigned>. Also check for an error state when computing the written value, which happens when the quota runs out. llvm-svn: 313546	2017-09-18 17:43:50 +00:00
Michael Kruse	eac3eebfea	[test] Enable -polly-codegen-verify for regression tests. In r301670 IR verification was disabled. Since then, CodeGen writing malformed IR would only be noticed by unpredictable behavior in follow-up passes (e.g. segfaults, infinite loops) or IR verification in the backend assert builds. Re-enable -polly-codegen-verify for the regression tests to ensure that malformed IR is detected where Polly generated malformed IR in the past and changes in CodeGen are at least partially covered by check-polly (otherwise malformed IR may only get noticed when the buildbots run the test-suite). Differential Revision: https://reviews.llvm.org/D37969 llvm-svn: 313527	2017-09-18 12:34:11 +00:00
Zachary Turner	ce92db13ea	Resubmit "[lit] Force site configs to run before source-tree configs" This is a resubmission of r313270. It broke standalone builds of compiler-rt because we were not correctly generating the llvm-lit script in the standalone build directory. The fixes incorporated here attempt to find llvm/utils/llvm-lit from the source tree returned by llvm-config. If present, it will generate llvm-lit into the output directory. Regardless, the user can specify -DLLVM_EXTERNAL_LIT to point to a specific lit.py on their file system. This supports the use case of someone installing lit via a package manager. If it cannot find a source tree, and -DLLVM_EXTERNAL_LIT is either unspecified or invalid, then we print a warning that tests will not be able to run. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313407	2017-09-15 22:10:46 +00:00
Zachary Turner	83dcb68468	Revert "[lit] Force site configs to run before source-tree configs" This patch is still breaking several multi-stage compiler-rt bots. I already know what the fix is, but I want to get the bots green for now and then try re-applying in the morning. llvm-svn: 313335	2017-09-15 02:56:40 +00:00
Zachary Turner	a0e55b6403	[lit] Force site configs to be run before source-tree configs This patch simplifies LLVM's lit infrastructure by enforcing an ordering that a site config is always run before a source-tree config. A significant amount of the complexity from lit config files arises from the fact that inside of a source-tree config file, we don't yet know if the site config has been run. However it is always required to run a site config first, because it passes various variables down through CMake that the main config depends on. As a result, every config file has to do a bunch of magic to try to reverse-engineer the location of the site config file if they detect (heuristically) that the site config file has not yet been run. This patch solves the problem by emitting a mapping from source tree config file to binary tree site config file in llvm-lit.py. Then, during discovery when we find a config file, we check to see if we have a target mapping for it, and if so we use that instead. This mechanism is generic enough that it does not affect external users of lit. They will just not have a config mapping defined, and everything will work as normal. On the other hand, for us it allows us to make many simplifications: * We are guaranteed that a site config will be executed first * Inside of a main config, we no longer have to assume that attributes might not be present and use getattr everywhere. * We no longer have to pass parameters such as --param llvm_site_config=<path> on the command line. * It is future-proof, meaning you don't have to edit llvm-lit.in to add support for new projects. * All of the duplicated logic of trying various fallback mechanisms of finding a site config from the main config are now gone. One potentially noteworthy thing that was required to implement this change is that whereas the ninja check targets previously used the first method to spawn lit, they now use the second. 
In particular, you can no longer run lit.py against the source tree while specifying the various `foo_site_config=<path>` parameters. Instead, you need to run llvm-lit.py. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313270	2017-09-14 16:47:58 +00:00
Roman Gareev	925ce50f1b	Unroll and separate the remaining parts of isolation The remaining parts produced by the full partial tile isolation can contain hot spots that are worth to be optimized. Currently, we rely on the simple loop unrolling pass, LiCM and the SLP vectorizer to optimize such parts. However, the approach can suffer from the lack of the information about aliasing that Polly provides using additional alias metadata or/and the lack of the information required by simple loop unrolling pass. This patch is the first step to optimize the remaining parts. To do it, we unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps to increase the performance of the generated code from 39.87 GFlop/s to 49.23 GFlop/s. The next possible step is to avoid unrolling performed by Polly in case of isolated and remaining parts and rely only on simple loop unrolling pass and the Loop vectorizer. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37692 llvm-svn: 312929	2017-09-11 17:46:47 +00:00
Michael Kruse	2f5cbc449a	[CodeGen] Bitcast scalar writes to actual value. The type of NewValue might change due to ScalarEvolution looking though bitcasts. The synthesized NewValue therefore becomes the type before the bitcast. llvm-svn: 312718	2017-09-07 12:15:01 +00:00
Michael Kruse	8ee179d3b4	Revert "[ScopDetect/Info] Look through PHIs that follow an error block" This reverts commit r312410 - [ScopDetect/Info] Look through PHIs that follow an error block The commit caused generation of invalid IR due to accessing a parameter that does not dominate the SCoP. llvm-svn: 312663	2017-09-06 19:05:40 +00:00
Michael Kruse	48c726f925	[test] Add forgotten REQUIRES: line. llvm-svn: 312632	2017-09-06 13:11:24 +00:00
Michael Kruse	bd84ce8931	[ZoneAlgo] Handle non-StoreInst/LoadInst MemoryAccesses including memset. Up to now ZoneAlgo considered array elements accessed by something other than a LoadInst or StoreInst as not analyzable. This patch removes that restriction by using the unknown ValInst to describe the written content, respectively the element type's null value in case of memset. Differential Revision: https://reviews.llvm.org/D37362 llvm-svn: 312630	2017-09-06 12:40:55 +00:00
Michael Kruse	420c4863a9	[Simplify] Actually remove unused instruction from region header. Since r312249 instructions of an entry block of region statements are not marked as root anymore and hence can theoretically be removed if unused. Theoretically, because the instruction list was not changed. Still, MemoryAccesses for unused instructions were removed. This led to a failed assertion in the code generator when the MemoryAccess for the still listed instruction was not found. This should fix the Assertion failed: ArrayAccess && "No array access found for instruction!", file ScopInfo.h, line 1494 compiler crashes. llvm-svn: 312566	2017-09-05 19:44:39 +00:00
Tobias Grosser	d6e0679c4e	[ForwardOp] Remove read accesses for all instructions that have been moved Before this patch, OpTree did not consider forwarding an operand tree consisting of only a single LoadInst as useful. The motivation was that, like an access to a read-only variable, it would just replace one MemoryAccess by another. However, in contrast to read-only accesses, this would replace a scalar access by an array access, which is something worth doing. In addition, leaving scalar MemoryAccess is problematic in that VirtualUse prioritizes inter-Stmt use over intra-Stmt. It was possible that the same LLVM value has a MemoryAccess for accessing the remote Stmt's LoadInst as well as having the same LoadInst in its own instruction list (due to being forwarded from another operand tree). With this patch we ensure that if a LoadInst is forwarded in any operand tree, the operand tree containing just the LoadInst is forwarded as well, which effectively removes the scalar MemoryAccess such that only the array access remains, not both. Thanks Michael for the detailed explanation. Reviewers: Meinersbur, bellu, singam-sanjay, gareevroman Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37424 llvm-svn: 312456	2017-09-03 19:52:15 +00:00
Tobias Grosser	701d943d12	[IslAst] Do not assert in case of empty min/max alias locations In certain situations, the context in the isl_ast_build could result for the min/max locations of our alias sets to become empty, which would cause an internal error in isl, which is then unable to derive a value for these expressions. Check these conditions before code generating expressions and instead assume that alias check succeeded. This is valid, as the corresponding memory accesses will not be executed under any valid context. This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting. llvm-svn: 312455	2017-09-03 19:47:19 +00:00
Tobias Grosser	99ccf05694	[ScopHelper] Do not crash on unreachable blocks This resolves llvm.org/PR34433. Thanks to Zhendong Su for reporting. llvm-svn: 312451	2017-09-03 18:01:22 +00:00
Tobias Grosser	4baedc70d1	[ScopDetect/Info] Look through PHIs that follow an error block In case a PHI node follows an error block we can assume that the incoming value can only come from the node that is not an error block. As a result, conditions that seemed non-affine before are now in fact affine. llvm-svn: 312410	2017-09-02 08:25:55 +00:00
Siddharth Bhat	3928e3f50a	[ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses. In Polly, we specifically add a parameter to represent the outermost dimension size of fortran arrays. We do this because this information is statically available from the fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from memory accesses. This is wrong, we should materialize parameters from scop array info. It is wrong because if there is a case where we detect 2 fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather than just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350	2017-09-01 18:55:43 +00:00
Michael Kruse	0c6c555beb	Fix Memory Access of failing tests. Mark scalar dependences for different statements belonging to same BB as 'Inter'. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D37147 llvm-svn: 312324	2017-09-01 11:36:52 +00:00
Tobias Grosser	2307f86c47	[ForwardOpTree] Allow forwarding in the presence of region statements Summary: After region statements now also have instruction lists, this is a straightforward extension. Reviewers: Meinersbur, bollu, singam-sanjay, gareevroman Reviewed By: Meinersbur Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37298 llvm-svn: 312249	2017-08-31 16:04:49 +00:00
Siddharth Bhat	56572c6a5e	[PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible. This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes function/intrinsic very uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239	2017-08-31 13:03:37 +00:00
Tobias Grosser	c43d0360cc	[BlockGenerator] Generate entry block of regions from instruction lists This adds code generation support for the previous commit. This patch has been re-applied, after the memory issue in the previous patch has been fixed. llvm-svn: 312211	2017-08-31 03:17:35 +00:00
Tobias Grosser	bd15d13d4e	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. This change set is reapplied, after a memory corruption issue had been fixed. llvm-svn: 312210	2017-08-31 03:15:56 +00:00
Tobias Grosser	d3edc16416	Revert "[ScopInfo] Use statement lists for entry blocks of region statements" This reverts commit r312128. It caused some memory issues. llvm-svn: 312209	2017-08-31 02:43:49 +00:00
Tobias Grosser	6f1f5cbb5b	Revert "[BlockGenerator] Generate entry block of regions from instruction lists" This reverts commit r312129. It caused some memory issues. llvm-svn: 312208	2017-08-31 02:43:27 +00:00
Adrian Prantl	6120801066	Adapt testcase to LLVM change in DIGlobalVariableExpression. llvm-svn: 312147	2017-08-30 18:12:35 +00:00
Tobias Grosser	1e34508bcc	[BlockGenerator] Generate entry block of regions from instruction lists This adds code generation support for the previous commit. llvm-svn: 312129	2017-08-30 15:08:30 +00:00
Tobias Grosser	6fbe4c8501	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. llvm-svn: 312128	2017-08-30 15:08:21 +00:00
Michael Kruse	591255183b	[ScopBuilder] Introduce metadata for splitting scop statement. This patch allows annotating of metadata in ir instruction (with "polly_split_after"), which specifies where to split a particular scop statement. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36402 llvm-svn: 312107	2017-08-30 10:11:06 +00:00
Michael Kruse	4728184342	[ZoneAlgo] More fine-grained bail-out. ZoneAlgo used to bail out for the complete SCoP if it encountered something violating its assumption. This meant that neither OpTree could forward any load nor DeLICM could do anything in such cases, even if their transformations are unrelated to the violations. This patch adds a list of compatible elements (currently with the granularity of entire arrays) that can be used for analysis. OpTree and DeLICM can then check whether their transformations only concern compatible elements, and skip non-compatible ones. This will be useful for e.g. Polybench's benchmarks covariance, correlation, bicg, doitgen, durbin, gramschmidt, adi that have assumption violations, but which are not necessarily relevant for all transformations. Differential Revision: https://reviews.llvm.org/D37219 llvm-svn: 311929	2017-08-28 20:39:07 +00:00
Tobias Grosser	ee8ad1c0ff	[IslAst] Do not compare arrays in alias check which are known to be identical This possibly helps to avoid run-time check failures in the COSMO kernels. llvm-svn: 311920	2017-08-28 20:17:02 +00:00
Tobias Grosser	93ab558d2e	[Detect] Consider nested loop profitable if entry block is not in loop In cases where the entry block of a scop was not contained in a loop that was part of the scop region and at the same time there was a loop surrounding the scop, we failed to count the loops in the scop and consequently did not consider the scop profitable. We correct this by only moving to the loop parent in case the current loop is contained in the scop. This increases the number of loops in COSMO which we assume to be profitable from 3974 to 4981. llvm-svn: 311863	2017-08-27 21:39:25 +00:00
Tobias Grosser	6d0970f64e	Revert "[polly] Fix ScopDetectionDiagnostic test failure caused by r310940" This reverts commit 950849ece9bb8fdd2b41e3ec348b9653b4e37df6. This commit broke various buildbots. llvm-svn: 311692	2017-08-24 19:47:15 +00:00
Michael Kruse	b795bfc0d4	[CodeGen] Detect impossible partial write conditions more reliably. Whether a partial write is tautological/unsatisfiable not only depends on the access domain, but also on the domain covered by its node in the AST. In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0). for (int c0 = 0; c0 < tmp5; c0 += 1) { Stmt_for_body344(c0); if (tmp5 >= c0 + 2) Stmt_cond_false(c0); Stmt_cond_end(c0); } if (tmp5 <= 0) { Stmt_for_body344(0); Stmt_cond_false(0); Stmt_cond_end(0); } Isl cannot derive a subscript for an array element that is never accessed. This caused an error in that no subscript expression has been generated in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one to exist because there is an execution of that write, just not in that ast node. Fixed by instead of determining whether the access domain is empty, inspect whether isl generated a constant "false" ast expression in the current ast node. This should fix a compiler crash of the aosp buildbot. llvm-svn: 311663	2017-08-24 14:51:35 +00:00
Andreas Simbuerger	e478e2de83	[Polly][WIP] Scalar fully indexed expansion Summary: This patch comes directly after https://reviews.llvm.org/D34982 which allows fully indexed expansion of MemoryKind::Array. This patch allows expansion for MemoryKind::Value and MemoryKind::PHI. MemoryKind::Value seems to be working with no major modifications of D34982. A test case has been added. Unfortunately, no "run time" checks can be done for now because, as @Meinersbur explains in a comment on D34982, DependenceInfo needs to be cleared and reset to take expansion into account in the remaining part of the Polly pipeline. There is no way to do that in Polly for now. MemoryKind::PHI is not working. Test case is in place, but not working. To expand MemoryKind::Array, we first expand the write and then the reads. For MemoryKind::PHI, the idea of the current implementation is to exchange the "roles" of the read and write and first expand the read according to its domain and then the writes. But with this strategy, I still encounter the problem of union_map in new access map.
For example with the following source code (source code of the test case) : ``` void mse(double A[Ni], double B[Nj]) { int i,j; double tmp = 6; for (i = 0; i < Ni; i++) { for (int j = 0; j<Nj; j++) { tmp = tmp + 2; } B[i] = tmp; } } ``` Polly gives us the following statements and memory accesses : ``` Statements { Stmt_for_body Domain := { Stmt_for_body[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_body[i0] -> [i0, 0, 0] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_04__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_11__phi[] }; Instructions { %tmp.04 = phi double [ 6.000000e+00, %entry.split ], [ %add.lcssa, %for.end ] } Stmt_for_inc Domain := { Stmt_for_inc[i0, i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9999 }; Schedule := { Stmt_for_inc[i0, i1] -> [i0, 1, i1] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_add_lcssa__phi[] }; Instructions { %tmp.11 = phi double [ %tmp.04, %for.body ], [ %add, %for.inc ] %add = fadd double %tmp.11, 2.000000e+00 %exitcond = icmp ne i32 %inc, 10000 } Stmt_for_end Domain := { Stmt_for_end[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_end[i0] -> [i0, 2, 0] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_tmp_04__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_add_lcssa__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 0] { Stmt_for_end[i0] -> MemRef_B[i0] }; Instructions { %add.lcssa = phi double [ %add, %for.inc ] store double %add.lcssa, double* %arrayidx, align 8 %exitcond5 = icmp ne i64 %indvars.iv.next, 10000 } } ``` and the following dependences : ``` { Stmt_for_inc[i0, 9999] -> Stmt_for_end[i0] : 0 <= i0 <= 9999; 
Stmt_for_inc[i0, i1] -> Stmt_for_inc[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998; Stmt_for_body[i0] -> Stmt_for_inc[i0, 0] : 0 <= i0 <= 9999; Stmt_for_end[i0] -> Stmt_for_body[1 + i0] : 0 <= i0 <= 9998 } ``` When trying to expand this memory access : ``` { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ``` The new access map would look like this : ``` { Stmt_for_inc[i0, 9999] -> MemRef_tmp_11__phi_exp[i0] : 0 <= i0 <= 9999; Stmt_for_inc[i0, i1] ->MemRef_tmp_11__phi_exp[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998 } ``` The idea for implementing the expansion of PHI accesses comes from @Meinersbur, and I don't understand why my implementation does not work. I must have missed something in my understanding of the idea. Contributed by: Nicolas Bonfante <nicolas.bonfante@gmail.com> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur Subscribers: llvm-commits, pollydev, Meinersbur Differential Revision: https://reviews.llvm.org/D36647 llvm-svn: 311619	2017-08-24 00:04:45 +00:00
Michael Kruse	7fac28fa4f	[ScopDetect] Include zero-iteration loops in loop count. Loops with zero iterations are, syntactically, loops. They have been excluded from the loop counter even for the non-profitable counters. This seems to be unintentional as the sentinel value of '0' minimal iterations does exclude such loops. Fix by never considering the iteration count when the sentinel value of 0 is found. This makes the recently added NumTotalLoops counter redundant with NumLoopsOverall, which now is equivalent. Hence, NumTotalLoops is removed as well. Note: The test case 'ScopDetect/statistics.ll' effectively does not check profitability, because -polly-process-unprofitable is passed to all test cases. llvm-svn: 311551	2017-08-23 13:29:59 +00:00
Jakub Kuderski	0ac1e585fc	[polly] Fix ScopDetectionDiagnostic test failure caused by r310940 Summary: ScopDetection used to check if a loop within a region was infinite and emitted a diagnostic in such cases. After r310940 there's no point checking against that situation, as infinite loops don't appear in regions anymore. The test failure was observed on these two polly buildbots: http://lab.llvm.org:8011/builders/polly-arm-linux/builds/8368 http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/10310 This patch XFAILs `ReportLoopHasNoExit.ll` and turns infinite loop detection into an assert. Reviewers: grosser, sanjoy, bollu Reviewed By: grosser Subscribers: efriedma, aemerson, kristof.beyls, dberlin, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36776 llvm-svn: 311503	2017-08-22 22:01:53 +00:00
Tobias Grosser	4a07bbe3f6	[IRBuilder] Only emit alias scop metadata for arrays, but not scalars Summary: There is no need to emit alias metadata for scalars, as basicaa will easily distinguish them from arrays. This reduces the size of the metadata we generate. This is especially useful after we moved to -polly-position=before-vectorizer, where a lot more scalar dependences are introduced, which increased the size of the alias analysis metadata and made us commonly reach the limits after which we do not emit alias metadata that have been introduced to prevent quadratic growth of this alias metadata. This improves 2mm performance from 1.5 seconds to 0.17 seconds. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37028 llvm-svn: 311498	2017-08-22 21:58:48 +00:00
Roman Gareev	0956a606ff	Disable the Loop Vectorizer in case of GEMM Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473	2017-08-22 17:38:46 +00:00
Michael Kruse	a28260f486	[test] Do not pipe binary data to FileCheck. llvm-svn: 311470	2017-08-22 17:09:56 +00:00
Michael Kruse	5b228bbb12	[ScopDetection] Add stat for total number of loops. The total number of loops is useful as a baseline comparing how many loops have been optimized in different configurations. llvm-svn: 311469	2017-08-22 17:09:51 +00:00
Tobias Grosser	6683c81af8	test/GPGPU/invalid-kernel-assert-verifymodule.ll also requires assertions llvm-svn: 311423	2017-08-22 03:12:29 +00:00
Siddharth Bhat	7bc77e87c8	[ScopInfo] Add option to treat all function parameters as dereferenceable. Dragonegg generates most function parameters as pointers to the actual parameters. However, it does not mark these parameters with the dereferenceable attribute. Polly is conservative when it comes to invariant load hoisting, thus we add runtime checks to invariant load hoisted pointers when we do not know that pointers are dereferenceable. This is correct behaviour, but is a performance penalty. Add a flag that allows all pointer parameters to be treated as dereferenceable. That way, polly can speculatively load-hoist parameters to functions without runtime checks. Differential Revision: https://reviews.llvm.org/D36461 llvm-svn: 311329	2017-08-21 11:57:04 +00:00
Tobias Grosser	b09bd74da8	[GPGPU] Add llvm.powi to the libdevice supported functions These intrinsics are used in COSMO. llvm-svn: 311324	2017-08-21 09:52:08 +00:00
Tobias Grosser	5170b6627a	[GPGPU] Add log / logf to the libdevice supported functions These two functions are used in COSMO llvm-svn: 311322	2017-08-21 09:00:31 +00:00
Michael Kruse	d091bf8d8e	[MatMul] Make MatMul detection independent of internal isl representations. The pattern recognition for MatMul is restrictive. The number of "disjuncts" in the isl_map containing constraint information was previously required to be 1 (as per isl_*_coalesce - which should ideally produce a domain map with a single disjunct, but does not under some circumstances). This was changed and made more flexible. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36460 llvm-svn: 311302	2017-08-20 21:31:11 +00:00
Tobias Grosser	e32498c9c3	Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]" We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268	2017-08-19 23:49:26 +00:00
Tobias Grosser	ecb94a0392	[GPGPU] Correctly initialize array order and fixed_element information Summary: This information is necessary for PPCG to perform correct life range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259	2017-08-19 20:21:22 +00:00
Philipp Schaad	50139f0f38	[PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtime Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen. Differential revision: D36925 llvm-svn: 311248	2017-08-19 17:04:57 +00:00
Tobias Grosser	43df2020e7	[GPGPU] Collect parameter dimension used in MemoryAccesses When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239	2017-08-19 12:58:28 +00:00
Andreas Simbuerger	8d5b257d02	[Polly][Bug fix] Wrong dependences filtering during Fully Indexed expansion Summary: When trying to expand memory accesses, the current version of Polly uses statement-level dependences. The current implementation does not work in case of multiple dependences per statement. For example in the following source code : ``` void mse(double A[Ni], double B[Nj], double C[Nj], double D[Nj]) { int i,j; for (j = 0; j < Ni; j++) { for (int i = 0; i<Nj; i++) S: B[i] = i; for (int i = 0; i<Nj; i++) T: D[i] = i; U: A[j] = B[j]; C[j] = D[j]; } } ``` The statement U has two dependences with S and T. The current version of polly fails during expansion. This patch aims to fix this bug. For that, we use reference-level dependences to be able to filter dependences according to statement and memory ref. The principle of expansion remains the same as before. We also noticed that we need to bail out if a load comes after a store (at the same position) in the same statement. So a check was added to isExpandable. Contributed by: Nicholas Bonfante <nicolas.bonfante@insa-lyon.fr> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur, simbuerg Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36791 llvm-svn: 311165	2017-08-18 15:01:18 +00:00
Tobias Grosser	ec02acfb98	[GPGPU] Simplify PPCGSCop to reduce compile time [NFC] Summary: Drop unused parameter dimensions to reduce the size of the sets we are working with. Especially the computed dependences tend to accumulate a lot of parameters that are present in the input memory accesses, but often not necessary to express the actual dependences. As isl represents maps and sets with dense matrices, reducing the dimensionality of isl sets commonly reduces code generation time. This reduces compile time from 17 to 11 seconds for our test case. While this is not impressive, this patch helped me to identify the previous two performance improvements and additionally also increases readability of the isl data structures we use. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36869 llvm-svn: 311161	2017-08-18 13:38:12 +00:00
Siddharth Bhat	b46847c035	[ScopInliner] Add a simple Scop-based inliner to polly. We add a ScopInliner pass which inlines functions based on a simple heuristic: Let `g` call `f`. If we can model all of `f` as a Scop, we inline `f` into `g`. This requires `-polly-detect-full-function` to be enabled. So, the pass asserts that `-polly-detect-full-function` is enabled. Differential Revision: https://reviews.llvm.org/D36832 llvm-svn: 311126	2017-08-17 21:57:23 +00:00
Siddharth Bhat	a2c4112791	[ManagedMemoryRewrite] Rewrite malloc, free correctly inside `Constant`s. Reuse the machinery built for replacing global arrays to replace malloc/free as well. Example replacement that was missed earlier: ``` call void \ bitcast (void (i8) @free to void (%custom_type)) (%custom_type* %13) ``` - Since the `bitcast` is a `ConstantExpr`, `replaceAllUsesWith` would miss this. We don't miss this anymore. Differential Revision: https://reviews.llvm.org/D36825 llvm-svn: 311121	2017-08-17 20:26:38 +00:00
Tobias Grosser	abc5416be1	[GPGPU] Make test case independent of LLVM names In release builds LLVM may not pass along LLVM names consistently. We make the test cases independent of the LLVM-IR names to avoid spurious test case failures. llvm-svn: 311118	2017-08-17 20:09:02 +00:00
Siddharth Bhat	8a2c07f6d4	[ManagedMemoryRewrite] Learn how to rewrite global arrays, allocas. - If we have global arrays, we would like to rewrite them to global pointers which are allocated using `cudaMallocManaged`. - If we have allocas in a function, we would like to rewrite them to heap-allocations with `cudaMallocManaged` and `cudaFree`. - With these rewrite mechanisms, we can offload _any_ function to the GPU with no code rewrite whatsoever. Differential Revision: https://reviews.llvm.org/D36516 llvm-svn: 311080	2017-08-17 11:22:52 +00:00
Tobias Grosser	ed6a4acc7f	Add rewrite by-reference parameter pass Summary: This pass detangles induction variables from functions, which take variables by reference. Most Fortran functions compiled with gfortran pass variables by reference. Unfortunately a common pattern, printf calls of induction variables, prevents in this situation the promotion of the induction variable to a register, which again inhibits any kind of loop analysis. To work around this issue we developed a specialized pass which introduces separate alloca slots for known-read-only references, which indicate to the mem2reg pass that the induction variables can be promoted to registers and consequently enable SCEV to work. We currently hardcode the information that a function _gfortran_transfer_integer_write does not read its second parameter, as dragonegg does not add the right annotations and we cannot change old dragonegg releases. Hopefully flang will produce the right annotations. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: bollu Subscribers: mgorny, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36800 llvm-svn: 311066	2017-08-17 05:25:08 +00:00
Tobias Grosser	5502eb0986	Add missing 'REQUIRES' line llvm-svn: 311046	2017-08-16 22:02:03 +00:00
Tobias Grosser	e2a45f32dc	[GPGPU] Also record invariant loads as kernel subtree values Before this change kernels that used invariant loads would have resulted in invalid PTX code. llvm-svn: 311042	2017-08-16 21:37:53 +00:00
Jakub Kuderski	8fb57125b0	[Polly] XFAIL ReportLoopHasNoExit tests after r310940 ReportLoopHasNoExit started failing after r310940 that added infinite loops to postdominators. The change made regions not contain infinite loops anymore. This patch unbreaks the polly tree by XFAILING the ReportLoopHasNoExit test. Full fix is under review in D36776. llvm-svn: 310980	2017-08-16 00:18:39 +00:00
Philip Pfaffe	c3bcdc2f1a	[JSON] Make the failure to parse a jscop file a hard error Summary: Before, if we fail to parse a jscop file, this will be reported as an error and importing is aborted. However, this isn't actually strong enough, since although the import is aborted, the scop has already been modified and is very likely broken. Instead, make this a hard failure and throw an LLVM error. This new behaviour requires small changes to the tests for the legacy pass, namely using `not` to verify the error. Further, fixed the jscop file for the base_pointer_load_is_inst_inside_invariant_1 testcase. Reviewed By: Meinersbur Split out of D36578. llvm-svn: 310599	2017-08-10 14:53:25 +00:00
Philip Pfaffe	e18f3f6708	Fix 310555: Require pollyacc instead of asserts llvm-svn: 310595	2017-08-10 14:21:04 +00:00
Philip Pfaffe	0360e5a3c2	Fix r310304: Fix the lit testcases. In opt, Polly passes are only available after -load. llvm-svn: 310581	2017-08-10 10:54:26 +00:00
Tobias Grosser	4db39c4829	Add missing 'REQUIRES' line llvm-svn: 310555	2017-08-10 08:11:47 +00:00
Tobias Grosser	cff9696e11	[GPGPU] Make the ast_build available to block generator This is necessary for partial writes (as used by delicm) to work. llvm-svn: 310553	2017-08-10 08:00:56 +00:00
Siddharth Bhat	9298ff2dee	[ManagedMemoryRewrite] [Polly] Erase original malloc and free. [NFC] We do not need to keep `malloc` and `free` around since they are replaced by `polly_{malloc,free}Managed.` llvm-svn: 310504	2017-08-09 18:19:46 +00:00
Siddharth Bhat	5a1f872623	[ManagedMemoryRewrite] Remove test case that was submitted by mistake. [NFC] llvm-svn: 310473	2017-08-09 13:34:54 +00:00
Siddharth Bhat	c4a4af47f3	[ManagedMemoryRewrite] Introduce a new pass to rewrite modules to use managed memory. This pass is useful to automatically convert a codebase that uses malloc/free to use their managed memory counterparts. Currently, rewrite malloc and free to the `polly_{malloc,free}Managed` variants. A future patch will teach ManagedMemoryRewrite to rewrite global arrays as pointers to globally allocated managed memory. Differential Revision: https://reviews.llvm.org/D36513 llvm-svn: 310471	2017-08-09 12:59:23 +00:00
Michael Kruse	40d083956c	[CodeGen] Use isLatestArrayKind(). Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen. This should fix various fails of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot. llvm-svn: 310466	2017-08-09 12:27:51 +00:00
Siddharth Bhat	34eeabbca3	[PPCGCodeGeneration] Compute element size in bytes for arrays correctly. Previously, we used to compute this with `elementSizeInBits / 8`. This would yield an element size of 0 when the array had element size < 8 in bits. To fix this, ask data layout what the size in bytes should be. Differential Revision: https://reviews.llvm.org/D36459 llvm-svn: 310448	2017-08-09 08:29:16 +00:00
Michael Kruse	235726ee4b	[test] Add descriptions and pseudocode to tests. NFC. llvm-svn: 310385	2017-08-08 17:26:19 +00:00
Roman Gareev	1563f039f5	Use SCEV information for the second level aliasing We introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. To distinguish two accesses, the comparison of raw pointers representing base pointers is used. In case of, for example, ublas's prod function that implements GEMM, and DeLICM we can get accesses to the same location represented by different raw pointers. Consequently, we create different alias sets that can prevent accesses from, for example, being sunk or hoisted. To avoid the issue, we compare the corresponding SCEV information instead of the corresponding raw pointers. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D35761 llvm-svn: 310380	2017-08-08 16:50:28 +00:00
Roman Gareev	dbde718676	Do not use isl_set_project_out to get all loop prefixes Currently, only convex isolation sets can be efficiently processed by isl. Consequently, as a temporary solution, we use a different algorithm for partial tile isolation that helps to build convex isolation sets in some cases. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36278 llvm-svn: 310374	2017-08-08 16:15:33 +00:00
Siddharth Bhat	9aca1cb519	[NFC] [PPCGCodeGen] Add missing REQUIRES: pollyacc line. llvm-svn: 310354	2017-08-08 12:26:37 +00:00
Siddharth Bhat	71dfb3eb07	[Polly] [PPCGCodeGeneration] Handle failing of invariant load hoisting gracefully. To do this, we replicate what `CodeGeneration` does. We expose `markNodeUnreachable` from `CodeGeneration` to `PPCGCodeGeneration`. Differential Revision: https://reviews.llvm.org/D36457 llvm-svn: 310350	2017-08-08 12:00:59 +00:00
Michael Kruse	27c010a22e	[DeLICM] Properly handle PHI writes becoming empty partial writes. It is possible that partial writes are empty (write is never executed). This is the case when a PHINode's incoming edge is never taken, such that the incoming write becomes an empty partial write, if enabled. The issue is that when converting the union_map to a map, its space cannot be derived from the union_map itself. Rather, we need to determine its space independently. This fixes test-suite's MultiSource/Benchmarks/ASC_Sequoia/CrystalMk. llvm-svn: 310348	2017-08-08 11:27:12 +00:00
Tobias Grosser	327e9ecb0d	[ScheduleOptimizer] Make matmul pattern detection work with delicm output In certain cases delicm might decide to not leave the original array write in the loop body, but to remove it and instead leave a transformed phi node as write access. This commit teaches the matmul pattern detection to order the memory accesses according to when the access actually happens and use this information to detect the new pattern. This makes pattern based matmul optimization work for 2mm and 3mm in polybench 4 after polly-position=before-vectorizer has been enabled. llvm-svn: 310338	2017-08-08 06:15:15 +00:00
Tobias Grosser	736c44c848	[test] Add some missing options that become necessary after the recent default changes llvm-svn: 310315	2017-08-07 22:10:23 +00:00
Tobias Grosser	a98081c9f5	[test] Add one more test case for the previous commit llvm-svn: 310312	2017-08-07 22:02:06 +00:00
Tobias Grosser	2ef378120d	[ZoneAlgo] Allow two writes that write identical values into same array slot Two write statements which write into the very same array slot generally are conflicting. However, in case the value that is written is identical, this does not cause any problem. Hence, allow such write pairs in this specific situation. llvm-svn: 310311	2017-08-07 22:01:29 +00:00
Andreas Simbuerger	81fb6b3e40	[Polly] Fully-Indexed static expansion This commit implements the initial version of fully-indexed static expansion. ``` for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) S: B[j] = j; T: A[i] = B[i] ``` After the pass, we want this : ``` for(int i = 0; i<Ni; i++) for(int j = 0; j<Ni; j++) S: B[i][j] = j; T: A[i] = B[i][i] ``` For now we bail (fail) in the following cases: - Scalar access - Multiple writes per SAI - MayWrite Access - Expansion that leads to an access to the original array Furthermore: We still miss checks for escaping references to the array base pointers. A future commit will add the missing escape-checks to stay correct in those cases. The expansion is still locked behind a CLI-Option and should not yet be used. Patch contributed by: Nicholas Bonfante <bonfante.nicolas@gmail.com> Reviewers: simbuerg, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: mgorny, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D34982 llvm-svn: 310304	2017-08-07 20:54:20 +00:00
Michael Kruse	70af4f579d	[ForwardOpTree] Use known array content analysis to forward load instructions. This is an addition to the -polly-optree pass that reuses the array content analysis from DeLICM to find array elements that contain the same value as the value loaded when the target statement instance is executed. The analysis is now enabled by default. The known content analysis could also be used to rematerialize any llvm::Value that was written to some array element, but currently only loads are forwarded. Differential Revision: https://reviews.llvm.org/D36380 llvm-svn: 310279	2017-08-07 18:40:29 +00:00
Tobias Grosser	aabfbfa5fc	Add missing 'REQUIRES: pollyacc' line llvm-svn: 310197	2017-08-06 11:21:09 +00:00
Tobias Grosser	b99c11710c	[GPGPU] Make sure managed arrays are prepared at the beginning of the scop Summary: This resolves some "instruction does not dominate use" errors, as we used to prepare the arrays at the location of the first kernel, which not necessarily dominated all other kernel calls. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D36372 llvm-svn: 310196	2017-08-06 11:10:38 +00:00
Tobias Grosser	5b307cdb8a	[GPGPU] Rename all, not only the first libdevice function llvm-svn: 310194	2017-08-06 03:04:15 +00:00
Siddharth Bhat	e53c924b0f	[Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in PPCGCodeGeneration. A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder`. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193	2017-08-06 02:39:05 +00:00
Tobias Grosser	c1cfe0a828	Add missing REQUIRES line llvm-svn: 309943	2017-08-03 14:46:53 +00:00
Tobias Grosser	b5563c6817	Make sure that all parameter dimensions are set in schedule Summary: In case the option -polly-ignore-parameter-bounds is set, not all parameters will be added to context and domains. This is useful to keep the size of the sets and maps we work with small. Unfortunately, for AST generation it is necessary to ensure all parameters are part of the schedule tree. Hence, we modify the GPGPU code generation to make sure this is the case. To obtain the necessary information we expose a new function Scop::getFullParamSpace(). We also make a couple of functions const to be able to make Scop::getFullParamSpace() const. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: nemanjai, kbarton, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36243 llvm-svn: 309939	2017-08-03 13:51:15 +00:00
Michael Kruse	291fd8074e	[test] Fix test case without Polly-ACC. llvm-svn: 309938	2017-08-03 13:44:31 +00:00
Siddharth Bhat	eadf76d34a	[PPCGCodeGeneration] Construct `isl_multi_pw_aff` of PPCGArray.bounds even when polly-ignore-parameter-bounds is turned on. When we have `-polly-ignore-parameter-bounds`, `Scop::Context` does not contain all the parameters present in the program. The construction of the `isl_multi_pw_aff` requires all the individual `pw_aff` to have the same parameter dimensions. To achieve this, we used to realign every `pw_aff` with `Scop::Context`. However, in conjunction with `-polly-ignore-parameter-bounds`, this is now incorrect, since `Scop::Context` does not contain all parameters. We set this up correctly by creating a space that has all the parameters used by all the `isl_pw_aff`. Then, we realign all `isl_pw_aff` to this space. llvm-svn: 309934	2017-08-03 12:09:33 +00:00
Singapuram Sanjay Srivallabh	188053af5e	Remove debug metadata from copied instruction to prevent GPUModule verification failure Summary: Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string codegeneration. When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string. The only debug metadata of the instruction, the DebugLoc, is unset by this patch. This patch reattempts https://reviews.llvm.org/D35630 by targeting only those instructions that are to end up in a Module meant for the GPU. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D36161 llvm-svn: 309822	2017-08-02 15:20:07 +00:00
Michael Kruse	bc88a78cb4	[Simplify] Rewrite redundant write detection algorithm. The previous algorithm was to search for a write and the source of its value operand, and see whether the write just stores the same read value back, which includes a search for whether there is another write access between them. This is O(n^2) in the max number of accesses in a statement (+ the complexity of isl comparing the access functions). The new algorithm is more similar to the one used for searching for overwrites and coalescable writes. It scans over all accesses in order of execution while tracking which array elements still have the same value since it was read. This is O(n), not counting the complexity within isl. It should be more reliable than trying to catch all non-conforming cases in the previous approach. It is also less code. We now also support if the write is a partial write of the read's domain, and to some extent non-affine subregions. Differential Revision: https://reviews.llvm.org/D36137 llvm-svn: 309734	2017-08-01 20:01:34 +00:00
Michael Kruse	693ef99935	[Simplify] Improve scalability. With a lot of reads and writes to the same array in a statement, some isl sets that capture the state between accesses can become so complex that isl takes considerable time and memory for operations on them. The problems identified were: - is_subset() takes considerable time with many disjuncts in the arguments. We limit the number of disjuncts to 4, any additional information is thrown away. - subtract() can lead to many disjuncts. We instead assume that any array element is possibly accessed, which removes all disjuncts. - subtract_domain() may lead to considerable processing, even if all elements are to be removed. Instead, we determine and remove the affected spaces manually. No behaviour is changed. llvm-svn: 309728	2017-08-01 19:39:11 +00:00
Siddharth Bhat	1ec9cba4e3	[NFC] Add 'REQUIRES: pollyacc' on 'test/GPGPU/invariant-load-hoisting-of-array.ll' - Should fix broken build due to `r309681`. llvm-svn: 309686	2017-08-01 14:52:18 +00:00
Siddharth Bhat	edf9581e4c	[PPCGCodeGeneration] Correct usage of llvm::Value with getLatestValue. It is possible that the `HostPtr` that corresponds to an array could be invariant load hoisted. Make sure we use the invariant load hoisted value by using `IslNodeBuilder::getLatestValue`. Differential Revision: https://reviews.llvm.org/D36001 llvm-svn: 309681	2017-08-01 14:26:39 +00:00
Michael Kruse	9f6e41cdba	[ForwardOpTree] Support synthesizable values. This allows -polly-optree to move instructions that depend on synthesizable values. The difficulty for synthesizable values is that their value depends on the location. When it is moved over a loop header, and the SCEV expression depends on the loop induction variable (SCEVAddRecExpr), it would use the current induction variable instead of the last one. At the moment we cannot forward PHI nodes such that crossing the header of loops referenced by SCEVAddRecExpr is not possible (assuming the loop header has at least two incoming blocks: for entering the loop and the backedge, such any instruction to be forwarded must have a phi between use and definition). A remaining issue is when the forwarded value is used after the loop, but is only synthesizable inside the loop. This happens e.g. if ScalarEvolution is unable to determine the number of loop iterations or the initial loop value. We do not forward in this situation. Differential Revision: https://reviews.llvm.org/D36102 llvm-svn: 309609	2017-07-31 19:46:21 +00:00
Michael Kruse	57cc92b790	[Simplify] Remove all kinds of redundant scalar writes. In addition to array and PHI writes, also allow scalar value writes. The only kind of write not allowed are writes by functions (including memcpy/memmove/memset). llvm-svn: 309582	2017-07-31 17:04:55 +00:00
Tobias Grosser	8fc6cdfb1c	[GPGPU] Add support for NVIDIA libdevice Summary: This allows us to map functions such as exp, expf, expl, for which no LLVM intrinsics exist. Instead, we link to NVIDIA's libdevice which provides high-performance implementations of a wide range of (math) functions. We currently link only a small subset, the exp, cos and copysign functions. Other functions will be enabled as needed. Reviewers: bollu, singam-sanjay Reviewed By: bollu Subscribers: tstellar, tra, nemanjai, pollydev, mgorny, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35703 llvm-svn: 309560	2017-07-31 14:03:16 +00:00
Tobias Grosser	39977e4e76	Revert "Remove Debug metadata from copied instruction to prevent Module verification failure" This reverts commit r309490 as it triggers on our AOSP buildbot error messages of the form: inlinable function call in a function with debug info must have a !dbg location llvm-svn: 309556	2017-07-31 11:43:38 +00:00
Singapuram Sanjay Srivallabh	cf9a813368	Remove Debug metadata from copied instruction to prevent Module verification failure Summary: Remove debug metadata from instruction to be copied to prevent the source file's debug metadata being copied into GPUModule and eventually failing Module verification and ASM string codegeneration. When copying the instruction onto the Module meant for the GPU, debug metadata attached to an instruction causes all related metadata to be pulled into the Module, including the DICompileUnit, which is not listed in llvm.dbg.cu of the Module. This fails the verification of the Module and generation of the ASM string. The only debug metadata of the instruction, the DebugLoc, is unset by this patch. Reviewers: grosser, bollu, Meinersbur Reviewed By: grosser, bollu Subscribers: pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35630 llvm-svn: 309490	2017-07-29 18:03:49 +00:00
Michael Kruse	ce9617f4fe	[Simplify] Implement write accesses coalescing. Write coalescing combines write accesses that - Write the same llvm::Value. - Write to the same array. - Unless they do not write anything in a statement instance (partial writes), write to the same element. - There is no other access between them that accesses the same element. This is particularly useful after DeLICM, which leaves partial writes to disjoint domains. Differential Revision: https://reviews.llvm.org/D36010 llvm-svn: 309489	2017-07-29 16:21:16 +00:00
Michael Kruse	4335c3992a	[test] Add test case for -polly-simplify. NFC. llvm-svn: 309458	2017-07-29 00:06:06 +00:00
Michael Kruse	8e41d2baab	[Simplify] Do not remove dependencies of phis within region stmts. These were wrongly assumed to be phi nodes that require MemoryKind::PHI accesses. llvm-svn: 309454	2017-07-28 23:22:32 +00:00
Adrian Prantl	99c4a5fb8e	Remove offset parameter from llvm.dbg.value intrinsics in testcase llvm-svn: 309433	2017-07-28 21:08:53 +00:00
Michael Kruse	c99209b4b2	[test] Fix typo in filename. NFC. llvm-svn: 309403	2017-07-28 16:57:56 +00:00
Michael Kruse	6c8f91b908	[Simplify] Fix typo in statistics output. NFC. llvm-svn: 309402	2017-07-28 16:57:51 +00:00
Michael Kruse	34a77780c5	[Simplify] Remove empty partial accesses first. NFC. This way, follow-up cleanups do not need special handling for such accesses. llvm-svn: 309401	2017-07-28 16:57:45 +00:00
Siddharth Bhat	0a1177b58e	[ScopDetect] add `-polly-ignore-func` flag to ignore functions by name. Ignore all functions whose name match a regex. Useful because creating a regex that does not match a string is somewhat hard. Example: https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex llvm-svn: 309377	2017-07-28 11:47:24 +00:00
Tobias Grosser	25271b91b2	[GPGPU] Do not require the Scop::Context to have information about all parameters llvm-svn: 309368	2017-07-28 06:49:44 +00:00
Tobias Grosser	30caae6d23	[GPGPU] Fix compilation issue with latest CUDA upgrade to i128 llvm-svn: 309366	2017-07-28 06:38:49 +00:00
Michael Kruse	8a8aca4299	[Simplify] Count PHINodes in simplifiable exit nodes as escaping use. After region exit simplification, the incoming block of a phi node in the SCoP region's exit block lands outside of the region. Since we treat SCoPs as if this already happened, we need to account for that when looking for outside uses of scalars (i.e. escaping scalars). llvm-svn: 309271	2017-07-27 14:09:31 +00:00
Michael Kruse	95b39da8ae	[Simplify] Fix invalid removal write for escaping values. A PHI node's incoming block is the user of its operand, not the PHI's parent. Assuming the PHINode's parent to be the user led to the removal of a MemoryAccess because its use was assumed to be inside of the SCoP. llvm-svn: 309164	2017-07-26 19:58:15 +00:00
Michael Kruse	11ed062258	[SCEVValidator] Loop exit values of loops before the SCoP are synthesizable. In the following loop: int i; for (i = 0; i < func(); i+=1) ; SCoP: for (int j = 0; j<n; j+=1) S(i, j) The value i is synthesizable in the SCoP that includes only the j-loop. This is because i is fixed within the SCoP, it is irrelevant whether it originates from another loop. This fixes a strange case where a PHI was synthesizable in a SCoP, but not its incoming value, triggering an assertion. This should fix MultiSource/Applications/sgefa/sgefa of the perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable buildbot. llvm-svn: 309109	2017-07-26 13:05:45 +00:00
Philip Pfaffe	85cc5687df	[IslAst] Untangle IslAst lit-testcases from specifics of the legacy-PM Summary: This consists of instances of two changes: - Accept any order of checks for a specific loop form, that appear in different order in the new vs legacy-PM. - Remove checks for specific regions. Reviewers: grosser Reviewed By: grosser Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35837 llvm-svn: 308976	2017-07-25 15:07:42 +00:00
Michael Kruse	b6317007b4	[ScopInfo] Fix assertion for PHIs not in a region stmts entry. A PHI node within a region statement is legal, but does not have a MemoryKind::PHI access. llvm-svn: 308973	2017-07-25 13:28:39 +00:00
Siddharth Bhat	43f178bbc9	[PPCGCodeGeneration] Skip arrays with empty extent. Invariant load hoisted scalars, and arrays whose size we can statically compute to be 0 do not need to be allocated as arrays. Invariant load hoisted scalars are sent to the kernel directly as parameters. Earlier, we used to allocate `0` bytes of memory for these because our computation of size from `PPCGCodeGeneration::getArraySize` would result in `0`. Now, since we don't treat invariant loads as arrays in PPCGCodeGeneration, this problem does not occur anymore. Differential Revision: https://reviews.llvm.org/D35795 llvm-svn: 308971	2017-07-25 12:35:36 +00:00
Michael Kruse	07e8c36dc7	[ForwardOpTree] Support read-only value uses. Read-only values (values defined before the SCoP) require special handling with -polly-analyze-read-only-scalars=true (which is the default). If active, each use of a value requires a read access. When a copied value uses a read-only value, we must also ensure that such a MemoryAccess is available or is created. Differential Revision: https://reviews.llvm.org/D35764 llvm-svn: 308876	2017-07-24 12:43:27 +00:00
Siddharth Bhat	e2699b572e	[Polly] [NFC] [ScopDetection] Make `polly-only-func` perform regex scop name match. Summary: - We were using `.count` in `StringRef`, which matches substrings. - We may want to use this for equality as well. - Generalise this, so allow regexes as a parameter to `polly-only-func`. Differential Revision: https://reviews.llvm.org/D35728 llvm-svn: 308875	2017-07-24 12:40:52 +00:00
Michael Kruse	ab8f0d57df	[Simplify] Remove partial write accesses with empty domain. If the access relation's domain is empty, the access will never be executed. We can just remove it. We only remove write accesses. Partial read accesses are not yet supported and instructions in the statement might require the llvm::Value holding the read's result to be defined. llvm-svn: 308830	2017-07-22 20:33:09 +00:00
Michael Kruse	e5f4706a55	[ForwardOpTree] Support hoisted invariant loads. Hoisted loads can be trivially supported because there are no MemoryAccess to be modified, the loaded value is just available at code generation. llvm-svn: 308826	2017-07-22 14:30:02 +00:00
Michael Kruse	a6b2de3b59	[ForwardOpTree] Introduce the -polly-optree pass. This pass 'forwards' operand trees into statements that use them in order to avoid scalar dependencies. This minimal implementation handles only the case of speculatable instructions. We will successively add support for: - Hoisted loads - Read-only values - Synthesizable values - Loads - PHIs - Forwarding only parts of the tree Differential Revision: https://reviews.llvm.org/D35754 llvm-svn: 308825	2017-07-22 14:02:47 +00:00
Philip Pfaffe	8f6c48e2aa	Untangle ScopInfo lit-testcases from specifics of the legacy-PM Summary: For the ScopInfo lit testsuite, this patch removes some dependences on output behaviour of the legacy PM. In most cases, these tests checked the tool output for labels created by the pass printer in the legacy PM. This doesn't work for the new PM anymore. Untangling the testcases is the first step to porting the testsuite for the new PM infrastructure. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D35727 llvm-svn: 308754	2017-07-21 16:47:36 +00:00
Philipp Schaad	2f3073b5cb	[Polly][GPGPU] Added SPIR Code Generation and Corresponding Runtime Support for Intel Summary: Added SPIR Code Generation to the PPCG Code Generator. This can be invoked using the polly-gpu-arch flag value 'spir32' or 'spir64' for 32 and 64 bit code respectively. In addition to that, runtime support has been added to execute said SPIR code on Intel GPU's, where the system is equipped with Intel's open source driver Beignet (development version). This requires the cmake flag 'USE_INTEL_OCL' to be turned on, and the polly-gpu-runtime flag value to be 'libopencl'. The transformation of LLVM IR to SPIR is currently quite a hack, consisting in part of regex string transformations. Has been tested (working) with Polybench 3.2 on an Intel i7-5500U (integrated graphics chip). Reviewers: bollu, grosser, Meinersbur, singam-sanjay Reviewed By: grosser, singam-sanjay Subscribers: pollydev, nemanjai, mgorny, Anastasia, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35185 llvm-svn: 308751	2017-07-21 16:11:06 +00:00
Tobias Grosser	1eeedf4829	[IslNodeBuilder] Relax complexity check in invariant loads and run it early When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659	2017-07-20 19:55:19 +00:00
Tobias Grosser	54491db687	Support fabs and copysign in Polly-ACC llvm-svn: 308649	2017-07-20 18:26:34 +00:00
Michael Kruse	22058c3fbb	[Simplify] Remove unused instructions and accesses. Use a mark-and-sweep algorithm to find and remove unused instructions and MemoryAccesses. This is useful in particular to remove scalar writes that are never used anywhere. A scalar write in a loop induces a write-after-write dependency that prevents the loop iterations from being rescheduled. Such writes can be a result of previous transformations such as DeLICM and operand tree forwarding. It adds a new class VirtualInstruction that represents an instruction in a particular statement. At the moment an instruction can only belong to the statement that represents a BasicBlock. In the future, instructions can be in one of multiple statements representing a BasicBlock (Nandini's work), in different statements than its BasicBlock would indicate, and even multiple statements at once (by forwarding operand trees). It also integrates nicely with the VirtualUse class. ScopStmt::contains(Instruction*) currently uses the instruction's parent BasicBlock to check whether it contains the instruction. It will need to check the actual statement list when one of the aforementioned features becomes possible. Differential Revision: https://reviews.llvm.org/D35656 llvm-svn: 308626	2017-07-20 16:21:55 +00:00
Siddharth Bhat	9e3db2b756	[PPCGCodeGen] [3/3] Update PPCGCodeGen + tests to latest ppcg. This commit WILL COMPILE. 1. `PPCG` now uses `isl_multi_pw_aff` instead of an array of `pw_aff`. This needs us to adjust how we index array bounds and how we construct array bounds. 2. `PPCG` introduces two new kinds of nodes: `init_device` and `clear_device`. We should investigate what the correct way to handle these are. 3. `PPCG` has gotten smarter with its use of live range reordering, so some of the tests have a qualitative improvement. 4. `PPCG` changed its output style, so many test cases need to be updated to fit the new style for `polly-acc-dump-code` checks. Differential Revision: https://reviews.llvm.org/D35677 llvm-svn: 308625	2017-07-20 15:48:36 +00:00
Michael Kruse	0865585eab	[ScopInfo] Add support for wrap-around of integers in unsigned comparisons. This is one possible solution to implement wrap-arounds for integers in unsigned icmp operations. For example, store i32 -1, i32* %A_addr %0 = load i32, i32* %A_addr %1 = icmp ult i32 %0, 0 %1 should hold false, because under the assumption of unsigned integers, -1 should wrap around to 2^32-1. However, previously, it was assumed that the MSB (Most Significant Bit - aka the Sign bit) was never set for integers in unsigned operations. This patch modifies the buildConditionSets function in ScopInfo.cpp to give better information about the integers in these unsigned comparisons. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D35464 llvm-svn: 308608	2017-07-20 12:37:02 +00:00
Philip Pfaffe	17b1ecfdc5	[CMake] Fix r307650: Readd missing dependency. The commit erroneously removed the dependency of the Polly tests on things like opt and FileCheck. Add that dependency back. llvm-svn: 308512	2017-07-19 19:20:58 +00:00
Roman Gareev	5abea0c97a	[FIX] Update test/ScheduleOptimizer/pattern-matching-based-opts_11.ll. llvm-svn: 308501	2017-07-19 18:01:51 +00:00
Roman Gareev	6531df41ae	[FIX] Fix pattern-matching-based-opts_11.ll. llvm-svn: 308499	2017-07-19 17:33:42 +00:00
Roman Gareev	750374181b	Make the pattern matching work with modified memory accesses Some optimizations (e.g., DeLICM) can modify memory accesses (e.g., change their MemoryKind). Consequently, the pattern matching should take this into account. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D33138 llvm-svn: 308494	2017-07-19 16:59:06 +00:00
Michael Kruse	bb7d22a31a	[Test] Do not pipe binary data to FileCheck. llvm-svn: 308437	2017-07-19 11:12:16 +00:00
Eli Friedman	e737fc120e	[Polly] [OptDiag] Updating Polly Diagnostics Remarks Utilizing newer LLVM diagnostic remark API in order to enable use of opt-viewer tool. Polly Diagnostic Remarks also now appear in YAML remark file. In this patch, I've added the OptimizationRemarkEmitter into certain classes where remarks are being emitted and updated the remark emit calls themselves. I also provide each remark a BasicBlock or Instruction from where it is being called, in order to compute the hotness of the remark. Patch by Tarun Rajendran! Differential Revision: https://reviews.llvm.org/D35399 llvm-svn: 308233	2017-07-17 23:58:33 +00:00
Tobias Grosser	4556c9b8fe	[ScopInfo] Simplify new access functions under domain context Summary: We do not keep domain constraints on access functions when building the scop. Hence, for consistency reasons, it makes also sense to not include them when storing a new access function. This change results in simpler access functions that make output easier to read. This patch also helps to make DeLICMed memory accesses to be understood by our matrix multiplication pattern matching pass. Further changes to the matrix multiplication pattern matching are needed for this to work, so the corresponding test case will be added in a future commit. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D35237 llvm-svn: 308215	2017-07-17 20:47:10 +00:00
Siddharth Bhat	233d717ec1	[PPCGCodeGeneration] Generate invariant loads before trying to generate IR. - We should call `preloadInvariantLoads` to make sure that code is generated for invariant loads in the kernel. Differential Revision: https://reviews.llvm.org/D35410 llvm-svn: 308187	2017-07-17 15:57:01 +00:00
Tobias Grosser	a3aa423fc3	[ScopDetection] If a loop is not part of a scop, none of its backedges can be This patch makes sure that in case a loop is not fully contained within a region that later forms a SCoP, none of the loop backedges are allowed to be part of the region. We currently do not support the situation where only some of a loop's backedges are part of a scop. Today, this can break both scop modeling and code generation. One such breaking test case is for example test/ScopDetectionDiagnostics/loop_partially_in_scop-2.ll, where we totally forgot to code generate some of the backedges. Fortunately, it is commonly not necessary to support these partial loops, it is way more common that either no backedge is included in a region or all loop backedges are included. This fixes a recent miscompile in MultiSource/Benchmarks/MiBench/consumer-typeset which was exposed after r306477. llvm-svn: 308113	2017-07-15 22:42:17 +00:00
Siddharth Bhat	03346c2701	[PPCGCodeGeneration] Fix runtime check adjustments since they make assumptions about BB layout. - There is a conditional branch that is used to switch between the old and new versions of the code. - If we detect that the build was unsuccessful, `PPCGCodeGeneration` will change the runtime check to be always set to false. - To actually reach this runtime check instruction, `PPCGCodeGeneration` was using assumptions about the layout of the BBs. - However, invariant load hoisting violates this assumption by inserting an extra basic block in the middle. - Fix the assumption on the layout by having `createScopConditionally` return the conditional branch instruction. - Use this reference to set to always-false. llvm-svn: 308010	2017-07-14 10:00:25 +00:00
Siddharth Bhat	a1b2086a33	[Invariant Loads] Do not consider invariant loads to have dependences. We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907	2017-07-13 12:18:56 +00:00
Singapuram Sanjay Srivallabh	1abd9ffa37	[PPCGCodeGen] Differentiate kernels based on their parent Scop Summary: Add a sequence number that identifies a ptx_kernel's parent Scop within a function to its name to differentiate it from other kernels produced from the same function, yet different Scops. Kernels produced from different Scops can end up having the same name. Consider a function with 2 Scops and each Scop being able to produce just one kernel. Both of these kernels have the name "kernel_0". This can lead to the wrong kernel being launched when the runtime picks a kernel from its cache based on the name alone. This patch supplements D33985, by differentiating kernels across Scops as well. Previously (even before D33985) while profiling kernels generated through JIT e.g. Julia, [[ https://groups.google.com/d/msg/polly-dev/J1j587H3-Qw/mR-jfL16BgAJ \| kernels associated with different functions, and even different SCoPs within a function, would be grouped together due to the common name ]]. This patch prevents this grouping and the kernels are reported separately. Reviewers: grosser, bollu Reviewed By: grosser Subscribers: mehdi_amini, nemanjai, pollydev, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D35176 llvm-svn: 307814	2017-07-12 16:46:19 +00:00
Siddharth Bhat	87fa280831	[Polly] [Tests] Update `lit.cfg` uses of `lit.util.capture` to `subprocess.check_output` - `lit.util.capture` was removed in `r306625`. - Replace `lit.util.capture` to `subprocess.check_output` as LLVM did. - LLVM revision of this change: `https://reviews.llvm.org/D35088`. Differential Revision: https://reviews.llvm.org/D35255 llvm-svn: 307765	2017-07-12 09:42:05 +00:00
Tobias Grosser	bed2ca6eac	[Simplify] Also remove redundant writes which originally came from PHI nodes llvm-svn: 307660	2017-07-11 14:29:39 +00:00
Philip Pfaffe	54df93d60e	[Polly][CMake] Skip unit-tests in lit if gtest is not available Summary: There is a bug in the current lit configurations for the unittests. If gtest is not available, the site-config for the unit tests won't be generated. Because lit recurses through the test directory, the lit configuration for the unit tests will be discovered nevertheless, leading to a fatal error in lit. This patch semi-gracefully skips the unittests if gtest is not available. As a result, running lit now prints this: `warning: test suite 'Polly-Unit' contained no test`. If people think that this is too annoying, the alternative would be to pick apart the test directory, so that the lit testsuite discovery will always only find one configuration. In fact, both of these things could be combined. While it's certainly nice that running a single lit command runs all the tests, I suppose people use the `check-polly` make target over lit most of the time, so the difference might not be noticed. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: mgorny, bollu, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D34053 llvm-svn: 307651	2017-07-11 11:37:35 +00:00
Philip Pfaffe	d99c406e3d	[Polly][CMake] Use the CMake Package instead of llvm-config in out-of-tree builds Summary: As of now, Polly uses llvm-config to set up LLVM dependencies in an out-of-tree build. This is problematic for two reasons: 1) Right now, in-tree and out-of-tree builds in fact do different things. E.g., in an in-tree build, libPolly depends on a handful of LLVM libraries, while in an out-of-tree build it depends on all of them. This means that we often need to treat both paths separately. 2) I'm specifically unhappy with the way libPolly is linked right now, because it just blindly links against all the LLVM libs. That doesn't make a lot of sense. For instance, one of these libs is LLVMTableGen, which contains a command line definition of a -o option. This means that I can not link an out-of-tree libPolly into a tool which might want to offer a -o option as well. This patch (mostly) drops the use of llvm-config in favor of LLVM's exported cmake package. However, building Polly with unittests requires access to the gtest sources (in the LLVM source tree). If we're building against an LLVM installation, this source tree is unavailable and must be specified. I'm using llvm-config to provide a default in this case. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: tstellar, bollu, chapuni, mgorny, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D33299 llvm-svn: 307650	2017-07-11 11:24:25 +00:00
Tobias Grosser	d6bea86029	[tests] Add import-jscop-dir to lit.site.cfg.in For the previous commit I accidentally added this change to lit.site.cfg, which is autogenerated and was consequently not part of the previous commit. llvm-svn: 307648	2017-07-11 11:07:01 +00:00
Tobias Grosser	e40c0fe3f8	[tests] Set -polly-import-jscop-dir=%S always This simplifies the test cases. llvm-svn: 307645	2017-07-11 10:39:01 +00:00
Tobias Grosser	6561f78b64	[Simplify] Add test case which we currently miss llvm-svn: 307643	2017-07-11 10:30:45 +00:00
Tobias Grosser	153a508349	[IslAst] Print memory accesses in AST dump When providing the option "-polly-ast-print-accesses" Polly also prints the memory accesses that are generated: #pragma known-parallel for (int c0 = 0; c0 <= 1023; c0 += 4) #pragma simd for (int c1 = c0; c1 <= c0 + 3; c1 += 1) Stmt_for_body( /* read */ &MemRef_B[0] /* write */ MemRef_A[c1] ); This makes writing and debugging memory layout transformations easier. Based on a patch contributed by Thomas Lang (ETH Zurich) llvm-svn: 307579	2017-07-10 20:13:06 +00:00
Siddharth Bhat	7cd1f53ce7	[NFC] [PPCGCodeGeneration] Extend `invariant-load-hoisting-with-variable-upper-bound` test case. - Check that we have invariant accesses. - Use `-polly-use-llvm-names` for better names in the test. - Rename test function to `f` for brevity. llvm-svn: 307401	2017-07-07 14:02:27 +00:00
Siddharth Bhat	1fc7b76a2b	[NFC] [PPCGCodeGeneration] Add test for simple invariant load hoisting. - This already works, but add this to ensure that there are no regressions when I expand the invariant load hoisting ability of `PPCGCodeGeneration`. llvm-svn: 307398	2017-07-07 13:44:22 +00:00
Tobias Grosser	41f02a9960	Make create_ll work with latest LLVM [NFC] - Instead of running with -O0, we enable the highest optimization level, but then disable optimizations. This ensures that possibly important metadata is still emitted. - Update the code for attribute removal to work with latest LLVM - Do not cut an arbitrary number of lines from the LL file. It is undocumented why this was needed in the first place, and such a feature is likely to break with trivial IR changes that may come in the future. llvm-svn: 307355	2017-07-07 04:20:55 +00:00
Siddharth Bhat	761e5b9310	[Polly] [PPCGCodeGeneration] Teach `must_kills` to kill scalars that are local to the scop. - By definition, we can pass something as a `kill` to PPCG if we know that no data can flow across a kill. - This is useful for more complex examples where we have scalars that are local to a scop. - If the local is only used within a scop, we are free to kill it. Differential Revision: https://reviews.llvm.org/D35045 llvm-svn: 307260	2017-07-06 13:42:42 +00:00
Singapuram Sanjay Srivallabh	79f13b9a80	Prefix the name of the calling host function in the name of callee GPU kernel Summary: Provide more context to the name of a GPU kernel by prefixing its name with the host function that calls it. E.g. The first kernel called by `gemm` would be `FUNC_gemm_KERNEL_0`. Kernels currently follow the "kernel_#" (# = 0,1,2,3,...) nomenclature. This patch makes it easier to map host caller and device callee, especially when there are many kernels produced by Polly-ACC. Reviewers: grosser, Meinersbur, bollu, philip.pfaffe, kbarton! Reviewed By: grosser Subscribers: nemanjai, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33985 llvm-svn: 307173	2017-07-05 16:48:21 +00:00
Siddharth Bhat	de0a534c75	[NFC] Fix breaking build by adding REQUIRES: pollyacc llvm-svn: 307165	2017-07-05 15:20:28 +00:00
Siddharth Bhat	a82f2d264a	[PPCGCodeGeneration] Teach Polly to start using live range reordering. Polly did not use PPCG's live range reordering feature. Teach PPCGCodeGeneration to use this. Documentation on this is sparse, so much of the code is conservative. We currently kill all phi nodes in a Scop by appending them to the must_kill map we pass to PPCG. I do not have a proof of correctness, but it seems to be intuitively correct. We also do not handle `array_order`, which, quoting PPCG, is: PPCG/gpu.h: "Order dependences on non-scalars." It seems to consist of RAW dependences between arrays. We need to pass this information for more complex privatization cases. Differential Revision: https://reviews.llvm.org/D34941 llvm-svn: 307163	2017-07-05 14:57:04 +00:00
Tobias Grosser	5e41458985	Bump isl to isl-0.18-768-g033b61ae Summary: This is a general maintenance update Reviewers: grosser Subscribers: srhines, fedor.sergeev, pollydev, llvm-commits Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Differential Revision: https://reviews.llvm.org/D34903 llvm-svn: 307090	2017-07-04 15:54:11 +00:00
Michael Kruse	b738ffa845	Heap allocation for new arrays. This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add: - In the class ScopArrayInfo, a boolean member called IsOnHeap which represents whether the array is allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes these two blocks and adds the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified: - createScopArrayInfo and getOrCreateScopArrayInfo such that they return a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionnaly such that it returns both the start block and the end block of the scop, because we need these two blocks to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540	2017-06-28 13:02:43 +00:00
Andreas Simbuerger	6d08ec7233	[JSONImport] Check if the size of an imported array is positive llvm-svn: 306479	2017-06-27 22:30:44 +00:00
Andreas Simbuerger	4e6eed8566	[FIX] Add %loadPolly to test This test fails if polly is not linked into LLVM's tools. Our lit site-config already deals with this by not adding the -load option if polly is linked into LLVM's tools. llvm-svn: 306395	2017-06-27 10:47:55 +00:00
Siddharth Bhat	65d7f72f2c	[PPCGCodeGeneration] Add flag to allow polly to fail if GPU kernel fails. - This is useful for debugging GPU code. llvm-svn: 306290	2017-06-26 14:56:56 +00:00
Siddharth Bhat	f291c8d510	[PPCGCodeGeneration] Allow intrinsics within kernels. - In D33414, if any function call was found within a kernel, we would bail out. - This is an over-approximation. This patch changes this by allowing the `llvm.sqrt.*` family of intrinsics. - This introduces an additional step when creating a separate llvm::Module for a kernel (GPUModule). We now copy function declarations from the original module to the new module. - We also populate IslNodeBuilder::ValueMap so it replaces function references to the old module with the ones in the new module (GPUModule). Differential Revision: https://reviews.llvm.org/D34145 llvm-svn: 306284	2017-06-26 13:12:06 +00:00
Tobias Grosser	2927cb7520	[tests] Add forgotten pollyacc REQUIRES line llvm-svn: 306273	2017-06-26 06:07:40 +00:00
Siddharth Bhat	a12f807f33	[PPCGCodeGeneration] Enable GPU code generation with invariant loads. The condition that disallowed code generation in PPCGCodeGeneration with invariant loads is not required. I haven't been able to construct a counterexample where this generates invalid code. Differential Revision: https://reviews.llvm.org/D34604 llvm-svn: 306245	2017-06-25 14:48:24 +00:00
Tobias Grosser	1b9d1bcc6d	[ScopInfo] Bound the number of array disjuncts in run-time bounds checks This reduces the compilation time of one reduced test case from Android from 16 seconds to 100 milliseconds (we bail out), without negatively impacting any other test case we currently have. We still occasionally saw compilation timeouts on the AOSP buildbot. Hopefully, those will go away with this change. llvm-svn: 306235	2017-06-25 06:32:00 +00:00
Roman Gareev	c4a4d04717	[FIX] A small addition to r305675. llvm-svn: 306234	2017-06-25 06:30:11 +00:00
Eli Friedman	5e589ea4b1	[ScopInfo] Fix crash with sum of invariant load and AddRec. r303971 added an assertion that SCEV addition involving an AddRec and a SCEVUnknown must involve a dominance relation: either the SCEVUnknown value dominates the AddRec's loop, or the AddRec's loop header dominates the SCEVUnknown. This is generally fine for most usage of SCEV because it isn't possible to write an expression in IR which would violate it, but it's a bit inconvenient here for polly. To solve the issue, just avoid creating a SCEV expression which triggers the assertion. I'm not really happy with this solution, but I don't have any better ideas. Fixes https://bugs.llvm.org/show_bug.cgi?id=33464. Differential Revision: https://reviews.llvm.org/D34259 llvm-svn: 305864	2017-06-20 22:53:02 +00:00
Michael Kruse	214deb7960	[CodeGen] Emit aliasing metadata for new arrays. Ensure that all array base pointers are assigned before generating aliasing metadata by allocating new arrays beforehand. Before this patch, getBasePtr() returned nullptr for new arrays because the arrays were created at a later point. Nullptr did not match to any array after the created array base pointers have been assigned and when the loads/stores are generated. llvm-svn: 305675	2017-06-19 10:19:29 +00:00
Eli Friedman	127e0cd21b	Don't check side effects for functions outside of SCoP In r304074 we introduced a patch to accept results from side effect free functions into SCEV modeling. This causes rejection of cases where the call is happening outside the SCoP. This patch checks if the call is outside the Region and treats the results as a parameter (SCEVType::PARAM) to the SCoP instead of returning SCEVType::INVALID. Patch by Sameer Abu Asal. llvm-svn: 305423	2017-06-14 22:43:28 +00:00
Siddharth Bhat	bccaea57c0	[Polly] [PPCGCodeGeneration] Skip Scops which contain function pointers. In `PPCGCodeGeneration`, we try to take the references of every `Value` that is used within a Scop to offload to the kernel. This occurs in `GPUNodeBuilder::createLaunchParameters`. This breaks if one of the values is a function pointer, since one of these cases will trigger: 1. We try to take the reference of an intrinsic function, and this breaks at `verifyModule`, since it is illegal to take the reference of an intrinsic. 2. We manage to take the reference to a function, but this fails at `verifyModule` since the function will not be present in the module that is created in the kernel. 3. Even if `verifyModule` succeeds (which should not occur), we would then try to call a host function from the device, which is illegal runtime behaviour. So, we disable this entire range of possibilities by simply not allowing function references within a `Scop` which corresponds to a kernel. However, note that this is too conservative. We can allow intrinsics within kernels if the backend can lower the intrinsic correctly. For example, an intrinsic like `llvm.powi.*` can actually be lowered by the `NVPTX` backend. We will now gradually whitelist intrinsics which are known to be safe. Differential Revision: https://reviews.llvm.org/D33414 llvm-svn: 305185	2017-06-12 11:41:09 +00:00
Siddharth Bhat	286c916dde	[Polly] [ScopDetection] Allow passing multiple functions to `-polly-only-func`. - This is useful to run optimisations on only certain functions. Differential Revision: https://reviews.llvm.org/D33990 llvm-svn: 305060	2017-06-09 08:23:40 +00:00
Michael Kruse	ad7a1805be	[Simplify] Use execution order of memory accesses. Iterate through memory accesses in execution order (first all implicit reads, then explicit accesses, then implicit writes). In the test case this caused an implicit load to be handled as if it was loaded after the write. That is, the value being written before it is available. This fixes llvm.org/PR33323 llvm-svn: 304810	2017-06-06 17:46:42 +00:00
Tobias Grosser	deefbced96	[Polly] [BlockGen] Support partial writes in regions Summary: The RegionGenerator traditionally kept a BlockMap that mapped from original basic blocks to newly generated basic blocks. With the introduction of partial writes such a 1:1 mapping is not possible any more, as a single basic block can be code generated into multiple basic blocks. Hence, depending on the use case we need to either use the first basic block or the last basic block. This is intended to address the last four cases of incorrect code generation in our AOSP buildbot and hopefully should turn it green. Reviewers: Meinersbur, bollu, gareevroman, efriedma, huihuiz, sebpop, simbuerg Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D33767 llvm-svn: 304808	2017-06-06 17:17:30 +00:00
Tobias Grosser	22be8a18f3	Add test coverage for regions with non-affine loops This adds test coverage for regions with non-affine loops, which we unfortunately missed when committing this feature years ago. We will add more test coverage over time. llvm-svn: 304672	2017-06-03 23:39:02 +00:00
Siddharth Bhat	726c28f8c4	[CodeGen] Track trip counts per-scop for performance measurement. - Add a counter that is incremented once on exit from a scop. - Test cases got split into two: one to test the cycles, and another one to test trip counts. - Sample output: ```name=sample-output.txt scop function, entry block name, exit block name, total time, trip count warmup, %entry.split, %polly.merge_new_and_old, 5180, 1 f, %entry.split, %polly.merge_new_and_old, 409944, 500 g, %entry.split, %polly.merge_new_and_old, 1226, 1 ``` Differential Revision: https://reviews.llvm.org/D33822 llvm-svn: 304543	2017-06-02 11:36:52 +00:00
Siddharth Bhat	a4dea6bb05	[CodeGen] Print performance counter information in CSV. This ensures that tools can parse performance information which Polly generates easily. - Sample output: ```name=out.csv scop function, entry block name, exit block name, total time warmup, %entry.split, %polly.merge_new_and_old, 1960 f, %entry.split, %polly.merge_new_and_old, 1238 g, %entry.split, %polly.merge_new_and_old, 1218 ``` - Example code to parse output: ```lang=python, name=example-parse.py import asciitable import sys table = asciitable.read('out.csv', delimiter=',') asciitable.write(table, sys.stdout, delimiter=',') ``` llvm-svn: 304533	2017-06-02 09:20:02 +00:00

1469 Commits