llvm-project

Commit Graph

Author	SHA1	Message	Date
Chandler Carruth	2f1fd1658f	[PM] Port ScalarEvolution to the new pass manager. This change makes ScalarEvolution a stand-alone object and just produces one from a pass as needed. Making this work well requires making the object movable, using references instead of overwritten pointers in a number of places, and other refactorings. I've also wired it up to the new pass manager and added a RUN line to a test to exercise it under the new pass manager. This includes basic printing support much like with other analyses. But there is a big and somewhat scary change here. Prior to this patch ScalarEvolution was never actually invalidated!!! Re-running the pass just re-wired up the various other analyses and didn't remove any of the existing entries in the SCEV caches or clear out anything at all. This might seem OK as everything in SCEV that can uses ValueHandles to track updates to the values that serve as SCEV keys. However, this still means that as we ran SCEV over each function in the module, we kept accumulating more and more SCEVs into the cache. At the end, we would have a SCEV cache with every value that we ever needed a SCEV for in the entire module!!! Yowzers. The releaseMemory routine would dump all of this, but that isn't realy called during normal runs of the pipeline as far as I can see. To make matters worse, there is actually a key that we don't update with value handles -- there is a map keyed off of Loops. Because LoopInfo does* release its memory from run to run, it is entirely possible to run SCEV over one function, then over another function, and then lookup a Loop* from the second function but find an entry inserted for the first function! Ouch. To make matters still worse, there are plenty of updates that don't trip a value handle. It seems incredibly unlikely that today GVN or another pass that invalidates SCEV can update values in just such a way that a subsequent run of SCEV will incorrectly find lookups in a cache, but it is theoretically possible and would be a nightmare to debug. With this refactoring, I've fixed all this by actually destroying and recreating the ScalarEvolution object from run to run. Technically, this could increase the amount of malloc traffic we see, but then again it is also technically correct. ;] I don't actually think we're suffering from tons of malloc traffic from SCEV because if we were, the fact that we never clear the memory would seem more likely to have come up as an actual problem before now. So, I've made the simple fix here. If in fact there are serious issues with too much allocation and deallocation, I can work on a clever fix that preserves the allocations (while clearing the data) between each run, but I'd prefer to do that kind of optimization with a test case / benchmark that shows why we need such cleverness (and that can test that we actually make it faster). It's possible that this will make some things faster by making the SCEV caches have higher locality (due to being significantly smaller) so until there is a clear benchmark, I think the simple change is best. Differential Revision: http://reviews.llvm.org/D12063 llvm-svn: 245193	2015-08-17 02:08:17 +00:00
Adam Nemet	5b0a479541	[LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC This was requested by Hal in D11205. llvm-svn: 244540	2015-08-11 00:09:37 +00:00
Adam Nemet	651a5a2401	[LAA] Remove unused pointer partition argument from needsChecking(), NFC This is no longer used in any of the callers. Also remove the logic of handling this argument. llvm-svn: 244421	2015-08-09 20:06:08 +00:00
Adam Nemet	385308877c	[LAA] Remove unused pointer partition argument from generateChecks, NFC LoopDistribution does its own filtering now. llvm-svn: 244420	2015-08-09 20:06:06 +00:00
Adam Nemet	155e8741f3	[LAA] Remove unused pointer partition argument from getNumberOfChecks, NFC This is unused after filtering checks was moved to the clients. As a result, we can just return the number of the checks in the precomputed set. llvm-svn: 244369	2015-08-07 22:44:21 +00:00
Adam Nemet	15840393f3	[LAA] Make the set of runtime checks part of the state of LAA, NFC This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed: 1. memory dependences via getDepChecker().getInsterestingDependences() 2. run-time checks via getRuntimePointerCheck().getChecks() However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise. The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks. llvm-svn: 244368	2015-08-07 22:44:15 +00:00
Adam Nemet	3a91e94734	[LAA] Remove unused pointer partition argument from print(), NFC This is now handled in the client. No need for LAA to provide this variant. llvm-svn: 244349	2015-08-07 19:44:48 +00:00
Adam Nemet	8701118792	[LAA] Remove unused pointer partition argument from addRuntimeCheck, NFC This variant of addRuntimeCheck is only used now from the LoopVectorizer which does not use this parameter. llvm-svn: 243955	2015-08-04 05:16:20 +00:00
Adam Nemet	53e30aec46	[LAA] Remove unused needsAnyChecking(), NFC llvm-svn: 243921	2015-08-03 23:33:03 +00:00
Craig Topper	e3dcce9700	De-constify pointers to Type since they can't be modified. NFC This was already done in most places a while ago. This just fixes the ones that crept in over time. llvm-svn: 243842	2015-08-01 22:20:21 +00:00
Silviu Baranga	4825060059	[LAA] Add clarifying comments for the checking pointer grouping algorithm. NFC llvm-svn: 243416	2015-07-28 13:44:08 +00:00
Adam Nemet	54f0b83ee2	[LAA] Split out a helper to print a collection of memchecks This is effectively an NFC but we can no longer print the index of the pointer group so instead I print its address. This still lets us cross-check the section that list the checks against the section that list the groups (see how I modified the test). E.g. before we printed this: Run-time memory checks: Check 0: Comparing group 0: %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc Against group 1: %arrayidxA = getelementptr i16, i16* %a, i64 %ind %arrayidxA1 = getelementptr i16, i16* %a, i64 %add ... Grouped accesses: Group 0: (Low: %c High: (78 + %c)) Member: {%c,+,4}<%for.body> Member: {(2 + %c),+,4}<%for.body> Now we print this (changes are underlined): Run-time memory checks: Check 0: Comparing group (0x7f9c6040c320): ~~~~~~~~~~~~~~ %arrayidxC1 = getelementptr inbounds i16, i16* %c, i64 %store_ind_inc %arrayidxC = getelementptr inbounds i16, i16* %c, i64 %store_ind Against group (0x7f9c6040c358): ~~~~~~~~~~~~~~ %arrayidxA1 = getelementptr i16, i16* %a, i64 %add %arrayidxA = getelementptr i16, i16* %a, i64 %ind ... Grouped accesses: Group 0x7f9c6040c320: ~~~~~~~~~~~~~~ (Low: %c High: (78 + %c)) Member: {(2 + %c),+,4}<%for.body> Member: {%c,+,4}<%for.body> llvm-svn: 243354	2015-07-27 23:54:41 +00:00
Adam Nemet	7c52e0527d	[LAA] Upper-case variable names, NFC llvm-svn: 243313	2015-07-27 19:38:50 +00:00
Adam Nemet	bbe1f1de16	[LAA] Split out a helper from addRuntimeCheck to generate the check, NFC llvm-svn: 243312	2015-07-27 19:38:48 +00:00
NAKAMURA Takumi	94abbbd6ab	LoopAccessAnalysis.cpp: Tweak r243239 to avoid side effects. It caused different emissions between gcc and clang. llvm-svn: 243258	2015-07-27 01:35:30 +00:00
Adam Nemet	1da7df3700	[LAA] Begin moving the logic of generating checks out of addRuntimeCheck Summary: The goal is to start moving us closer to the model where RuntimePointerChecking will compute and store the checks. Then a client can filter the check according to its requirements and then use the filtered list of checks with addRuntimeCheck. Before the patch, this is all done in addRuntimeCheck. So the patch starts to split up addRuntimeCheck while providing the old API under what's more or less a wrapper now. The new underlying addRuntimeCheck takes a collection of checks now, expands the code for the bounds then generates the code for the checks. I am not completely happy with making expandBounds static because now it needs so many explicit arguments but I don't want to make the type PointerBounds part of LAI. This should get fixed when addRuntimeCheck is moved to LoopVersioning where it really belongs, IMO. Audited the assembly diff of the testsuite (including externals). There is a tiny bit of assembly churn that is due to the different order the code for the bounds is expanded now (MultiSource/Benchmarks/Prolangs-C/bison/conflicts.s and with LoopDist on 456.hmmer/fast_algorithms.s). Reviewers: hfinkel Subscribers: klimek, llvm-commits Differential Revision: http://reviews.llvm.org/D11205 llvm-svn: 243239	2015-07-26 05:32:14 +00:00
Silviu Baranga	0e5804a6af	Fix memcheck interval ends for pointers with negative strides Summary: The checking pointer grouping algorithm assumes that the starts/ends of the pointers are well formed (start <= end). The runtime memory checking algorithm also assumes this by doing: start0 < end1 && start1 < end0 to detect conflicts. This check only works if start0 <= end0 and start1 <= end1. This change correctly orders the interval ends by either checking the stride (if it is constant) or by using min/max SCEV expressions. Reviewers: anemet, rengolin Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11149 llvm-svn: 242400	2015-07-16 14:02:58 +00:00
Adam Nemet	041e6deb2c	[LAA] Split out a helper to check the pointer partitions, NFC This is made a static public member function to allow the transition of this logic from LAA to LoopDistribution. (Technically, it could be an implementation-local static function but then it would not be accessible from LoopDistribution.) llvm-svn: 242376	2015-07-16 02:48:05 +00:00
Adam Nemet	9f7dedc376	[LAA] Introduce RuntimePointerChecking::PointerInfo, NFC Turn this structure-of-arrays (i.e. the various pointer attributes) into array-of-structures. llvm-svn: 242219	2015-07-14 22:32:50 +00:00
Adam Nemet	7cdebac0c8	[LAA] Lift RuntimePointerCheck out of LoopAccessInfo, NFC I am planning to add more nested classes inside RuntimePointerCheck so all these triple-nesting would be hard to follow. Also rename it to RuntimePointerChecking (i.e. append 'ing'). llvm-svn: 242218	2015-07-14 22:32:44 +00:00
Silviu Baranga	a647c30f88	Cleanup after r241809 - remove uncessary call to std::sort Summary: The iteration order within a member of DepCands is deterministic and therefore we don't have to sort the accesses within a member. We also don't have to copy the indices of the pointers into a vector, since we can iterate over the members of the class. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11145 llvm-svn: 242033	2015-07-13 14:48:24 +00:00
Adam Nemet	0f67c6c1d5	[LAA] Fix grammar in debug output llvm-svn: 241867	2015-07-09 22:17:41 +00:00
Adam Nemet	ee61474a61	[LAA] Hide NeedRTCheck logic completely inside canCheckPtrAtRT, NFC Currently canCheckPtrAtRT returns two flags NeedRTCheck and CanDoRT. NeedRTCheck says whether we need checks and CanDoRT whether we can generate the checks. The idea is to encode three states with these: Need/Can: (1) false/dont-care: no checks are needed (2) true/false: we need checks but can't generate them (3) true/true: we need checks and we can generate them This is pretty unnecessary since the caller (analyzeLoop) is only interested in whether we can generate the checks if we actually need them (i.e. 1 or 3). So this change cleans up to return just that (CanDoRTIfNeeded) and pulls all the underlying logic into canCheckPtrAtRT. By doing all this, we simplify analyzeLoop which is the complex function in LAA. There is further room for improvement here by using RtCheck.Need directly rather than a new local variable NeedRTCheck but that's for a later patch. llvm-svn: 241866	2015-07-09 22:17:38 +00:00
Silviu Baranga	ce3877fc8c	Don't rely on the DepCands iteration order when constructing checking pointer groups Summary: The checking pointer group construction algorithm relied on the iteration on DepCands. We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism. This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic. We need to update the tests, since the order in which pointers appear has changed. No new tests were added, since it is impossible to test for non-determinism. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11064 llvm-svn: 241809	2015-07-09 15:18:25 +00:00
Adam Nemet	b41d2d3fa3	[LAA] Fix line break in comment llvm-svn: 241785	2015-07-09 06:47:21 +00:00
Adam Nemet	5dc3b2cf53	[LAA] Rename IsRTNeeded to IsRTCheckAnalysisNeeded The original name was too close to NeedRTCheck which is what the actual memcheck analysis returns. This flag, as the new name suggests, is only used to whether to initiate that analysis. Also a comment is added to answer one question I had about this code for a long time. Namely, how does this flag differ from isDependencyCheckNeeded since they are seemingly set at the same time. llvm-svn: 241784	2015-07-09 06:47:18 +00:00
Adam Nemet	943befedf1	[LAA] Fix misleading use of word 'consecutive' Fix some places where the word consecutive is used but the code really means constant-stride (i.e. not just unit stride). llvm-svn: 241763	2015-07-09 00:03:22 +00:00
Adam Nemet	424edc6c80	[LAA] Revert a small part of r239295 This commit ([LAA] Fix estimation of number of memchecks) regressed the logic a bit. We shouldn't quit the analysis if we encounter a pointer without known bounds unless we actually need to emit a memcheck for it. The original code was using NumComparisons which is now computed differently. Instead I compute NeedRTCheck from NumReadPtrChecks and NumWritePtrChecks. As side note, I find the separation of NeedRTCheck and CanDoRT confusing, so I will try to merge them in a follow-up patch. llvm-svn: 241756	2015-07-08 22:58:48 +00:00
Adam Nemet	0131a5693a	[LAA] Add missing debug output after r239285 r239285 ([LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses.) introduced a new case under MemoryDepChecker::isDependent. We normally have debug output for each case. llvm-svn: 241707	2015-07-08 18:47:38 +00:00
Silviu Baranga	1b6b50a921	[LAA] Merge memchecks for accesses separated by a constant offset Summary: Often filter-like loops will do memory accesses that are separated by constant offsets. In these cases it is common that we will exceed the threshold for the allowable number of checks. However, it should be possible to merge such checks, sice a check of any interval againt two other intervals separated by a constant offset (a,b), (a+c, b+c) will be equivalent with a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap. Assuming the loop will be executed for a sufficient number of iterations, this will be true. If not true, checking against (a, b+c) is still safe (although not equivalent). As long as there are no dependencies between two accesses, we can merge their checks into a single one. We use this technique to construct groups of accesses, and then check the intervals associated with the groups instead of checking the accesses directly. Reviewers: anemet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10386 llvm-svn: 241673	2015-07-08 09:16:33 +00:00
David Blaikie	b447ac6435	Move VectorUtils from Transforms to Analysis to correct layering violation llvm-svn: 240804	2015-06-26 18:02:52 +00:00
Adam Nemet	c4866d29dd	[LAA] Try to prove non-wrapping of pointers if SCEV cannot Summary: Scalar evolution does not propagate the non-wrapping flags to values that are derived from a non-wrapping induction variable because the non-wrapping property could be flow-sensitive. This change is a first attempt to establish the non-wrapping property in some simple cases. The main idea is to look through the operations defining the pointer. As long as we arrive to a non-wrapping AddRec via a small chain of non-wrapping instruction, the pointer should not wrap either. I believe that this essentially is what Andy described in http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way forward. Reviewers: aschwaighofer, nadav, sanjoy, atrick Reviewed By: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D10472 llvm-svn: 240798	2015-06-26 17:25:43 +00:00
Chandler Carruth	ecbd16829a	[PM/AA] Remove the UnknownSize static member from AliasAnalysis. This is now living in MemoryLocation, which is what it pertains to. It is also an enum there rather than a static data member which is left never defined. llvm-svn: 239886	2015-06-17 07:21:38 +00:00
Chandler Carruth	ac80dc7532	[PM/AA] Remove the Location typedef from the AliasAnalysis class now that it is its own entity in the form of MemoryLocation, and update all the callers. This is an entirely mechanical change. References to "Location" within AA subclases become "MemoryLocation", and elsewhere "AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps out-of-tree folks update. llvm-svn: 239885	2015-06-17 07:18:54 +00:00
Silviu Baranga	98a137196a	[LAA] Fix estimation of number of memchecks Summary: We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y are writes. Assuming we have w writes and r reads, currently this number is estimated as being w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate the number of checks required. This change adds a getNumberOfChecks method to RuntimePointerCheck, which will count the number of runtime checks needed (similar in implementation to needsAnyChecking) and uses it to produce the correct number of runtime checks. Test Plan: llvm test suite spec2k spec2k6 Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit). Reviewers: anemet Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D10217 llvm-svn: 239295	2015-06-08 10:27:06 +00:00
Hao Liu	32c0539691	[LoopVectorize] Teach Loop Vectorizor about interleaved memory accesses. Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector. E.g. for (i = 0; i < N; i+=2) { a = A[i]; // load of even element b = A[i+1]; // load of odd element ... // operations on a, b, c, d A[i] = c; // store of even element A[i+1] = d; // store of odd element } The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like: %wide.vec = load <8 x i32>, <8 x i32>* %ptr %vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6> %vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7> The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like: %interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> store <8 x i32> %interleaved.vec, <8 x i32>* %ptr This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'. llvm-svn: 239291	2015-06-08 06:39:56 +00:00
Hao Liu	751004a67d	[LoopAccessAnalysis] Teach LAA to check the memory dependence between strided accesses. Differential Revision: http://reviews.llvm.org/D9368 llvm-svn: 239285	2015-06-08 04:48:37 +00:00
Chandler Carruth	70c61c1a8a	[PM/AA] Start refactoring AliasAnalysis to remove the analysis group and port it to the new pass manager. All this does is extract the inner "location" class used by AA into its own full fledged type. This seems much cleaner as MemoryDependence and soon MemorySSA also use this heavily, and it doesn't make much sense being inside the AA infrastructure. This will also make it much easier to break apart the AA infrastructure into something that stands on its own rather than using the analysis group design. There are a few places where this makes APIs not make sense -- they were taking an AliasAnalysis pointer just to build locations. I'll try to clean those up in follow-up commits. Differential Revision: http://reviews.llvm.org/D10228 llvm-svn: 239003	2015-06-04 02:03:15 +00:00
Adam Nemet	df3dc5b9ca	[LoopAccesses] If shouldRetryWithRuntimeCheck, reset InterestingDependences When dependence analysis encounters a non-constant distance between memory accesses it aborts the analysis and falls back to run-time checks only. In this case we weren't resetting the array of dependences. llvm-svn: 237574	2015-05-18 15:37:03 +00:00
Adam Nemet	c3384320f2	[LoopAccesses] Rearrange printed lines in -analyze "Store to invariant address..." is moved as the last line. This is not the prime result of the analysis. Plus it simplifies some of the tests. llvm-svn: 237573	2015-05-18 15:36:57 +00:00
Adam Nemet	f10ca27884	[LoopAccesses] Debug improvement Report pointers with unknown bounds. llvm-svn: 237572	2015-05-18 15:36:52 +00:00
Adam Nemet	e2b885c4bc	[getUnderlyingOjbects] Analyze loop PHIs further to remove false positives Specifically, if a pointer accesses different underlying objects in each iteration, don't look through the phi node defining the pointer. The motivating case is the underlyling-objects-2.ll testcase. Consider the loop nest: int *A; for (i) for (j) A[i][j] = A[i-1][j] B[j] This loop is transformed by Load-PRE to stash away A[i] for the next iteration of the outer loop: Curr = A[0]; // Prev_0 for (i: 1..N) { Prev = Curr; // Prev = PHI (Prev_0, Curr) Curr = A[i]; for (j: 0..N) Curr[j] = Prev[j] * B[j] } Since A[i] and A[i-1] are likely to be independent pointers, getUnderlyingObjects should not assume that Curr and Prev share the same underlying object in the inner loop. If it did we would try to dependence-analyze Curr and Prev and the analysis of the corresponding SCEVs would fail with non-constant distance. To fix this, the getUnderlyingObjects API is extended with an optional LoopInfo parameter. This is effectively what controls whether we want the above behavior or the original. Currently, I only changed to use this approach for LoopAccessAnalysis. The other testcase is to guard the opposite case where we do want to look through the loop PHI. If we step through an array by incrementing a pointer, the underlying object is the incoming value of the phi as the loop is entered. Fixes rdar://problem/19566729 llvm-svn: 235634	2015-04-23 20:09:20 +00:00
Adam Nemet	8dcb3b6a59	[LoopAccesses] Improve debug output llvm-svn: 235238	2015-04-17 22:43:10 +00:00
Adam Nemet	26da8e9800	[LoopAccesses] Properly print whether memchecks are needed Fix oversight in -analyze output. PtrRtCheck contains the pointers that need to be checked against each other and not whether memchecks are necessary. For instance in the testcase PtrRtCheck has four elements but all no-alias so no checking is necessary. llvm-svn: 234833	2015-04-14 01:12:55 +00:00
Adam Nemet	ce48250f11	[LoopAccesses] Allow analysis to complete in the presence of uniform stores (Re-apply r234361 with a fix and a testcase for PR23157) Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out the be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The changes also adds support to query this property of the loop and modify the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234424	2015-04-08 17:48:40 +00:00
Adam Nemet	e09a928c80	Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores" This reverts commit r234361. It caused PR23157. llvm-svn: 234387	2015-04-08 04:16:55 +00:00
Adam Nemet	0515c33b70	[LoopAccesses] Allow analysis to complete in the presence of uniform stores Both run-time pointer checking and the dependence analysis are capable of dealing with uniform addresses. I.e. it's really just an orthogonal property of the loop that the analysis computes. Run-time pointer checking will only try to reason about SCEVAddRec pointers or else gives up. If the uniform pointer turns out the be a SCEVAddRec in an outer loop, the run-time checks generated will be correct (start and end bounds would be equal). In case of the dependence analysis, we work again with SCEVs. When compared against a loop-dependent address of the same underlying object, the difference of the two SCEVs won't be constant. This will result in returning an Unknown dependence for the pair. When compared against another uniform access, the difference would be constant and we should return the right type of dependence (forward/backward/etc). The changes also adds support to query this property of the loop and modify the vectorizer to use this. Patch by Ashutosh Nema! llvm-svn: 234361	2015-04-07 21:46:16 +00:00
Adam Nemet	51870d16e4	[LoopAccesses] New API to query if memchecks are necessary after partitioning This is used by Loop Distribution. llvm-svn: 234283	2015-04-07 03:35:26 +00:00
Adam Nemet	90fec840eb	[LoopAccesses] Handle case when no memchecks are needed after partitioning llvm-svn: 233930	2015-04-02 17:51:57 +00:00
Benjamin Kramer	799003bf8c	Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. llvm-svn: 232998	2015-03-23 19:32:43 +00:00
Michael Zolotukhin	9b3cf604ce	LoopVectorize: teach loop vectorizer to vectorize calls. The tests would be committed in a commit for http://reviews.llvm.org/D8131 Review: http://reviews.llvm.org/D8095 llvm-svn: 232530	2015-03-17 19:46:50 +00:00
Adam Nemet	4bb90a71de	[LoopAccesses] Add debug message to indicate the result of the analysis The debug message was pretty confusing here. It only reported the situation with memchecks without the result of the dependence analysis. Now it prints whether the loop is safe from the POV of the dependence analysis and if yes, whether we need memchecks. llvm-svn: 231854	2015-03-10 21:47:39 +00:00
David Majnemer	d388e930ce	LoopAccessAnalysis: Silence -Wreturn-type diagnostic from GCC llvm-svn: 231836	2015-03-10 20:23:29 +00:00
Adam Nemet	949e91a6fa	[LAA-memchecks] Comment improvement I forgot to roll this into r231816. It was requested by Hal in D8122. llvm-svn: 231821	2015-03-10 19:12:41 +00:00
Adam Nemet	ec1e2bb6a4	[LAA-memchecks 3/3] Introduce pointer partitions for memchecks This is the final patch that actually introduces the new parameter of partition mapping to RuntimePointerCheck::needsChecking. Another API (LAI::getInstructionsForAccess) is also exposed that helps to map pointers to instructions because ultimately we partition instructions. The WIP version of the Loop Distribution pass in D6930 has been adapted to use all this. See for example, how InstrPartitionContainer::computePartitionSetForPointers sets up the partitions using the above API and then calls to LAI::addRuntimeCheck with the pointer partitions. llvm-svn: 231818	2015-03-10 18:54:26 +00:00
Adam Nemet	98c4c5dd78	[LAA-memchecks 2/3] Move number of memcheck threshold checking to LV Now the analysis won't "fail" if the memchecks exceed the threshold. It is the transform pass' responsibility to perform the check. This allows the transform pass to further analyze/eliminate the memchecks. E.g. in Loop distribution we only need to check pointers that end up in different partitions. Note that there is a slight change of functionality here. The logic in analyzeLoop is that if dependence checking fails due to non-constant distance between the pointers, another attempt is made to prove safety of the dependences purely using run-time checks. Before this patch we could fail the loop due to exceeding the memcheck threshold after the first step, now we only check the threshold in the client after the full analysis. There is no measurable compile-time effect but I wanted to record this here. llvm-svn: 231817	2015-03-10 18:54:23 +00:00
Adam Nemet	b6dc76ffe5	[LAA-memchecks 1/3] Split out NumComparisons checks. NFC The check for the number of memchecks will be moved to the client of this analysis. Besides allowing for transform-specific thresholds, this also lets Loop Distribution post-process the memchecks; Loop Distribution only needs memchecks between pointers of different partitions. The motivation for this first patch is to untangle the CanDoRT check from the NumComparison check before moving the NumComparison part. CanDoRT means that we couldn't determine the bounds for the pointer. Note that NumComparison is set independent of this flag. llvm-svn: 231816	2015-03-10 18:54:19 +00:00
Adam Nemet	58913d65ad	[LoopAccesses 3/3] Print the dependences with -analyze The dependences are now expose through the new getInterestingDependences API so we can use that with -analyze too and fix the FIXME. This lets us remove the test that relied on -debug to check the dependences. llvm-svn: 231807	2015-03-10 17:40:43 +00:00
Adam Nemet	9c92657971	[LoopAccesses 2/3] Allow querying of interesting dependences Gather an array of interesting dependences rather than just failing after the first unsafe one and regarding the loop unsafe. Loop Distribution needs to be able to collect all dependences in order to isolate the dependence cycles into their own partition. Since the dependence checking algorithm is quadratic in terms of accesses sharing the same underlying pointer, I am applying a cut-off threshold (MaxInterestingDependence). Exceeding that, the logic reverts back to the original approach deeming the loop unsafe upon encountering the first unsafe dependence. The main idea of the patch is to split isDepedent from directly answering the question whether the dep is safe for vectorization to return a dependence type which then gets mapped to old boolean result using Dependence::isSafeForVectorization. Tested that this was compile-time neutral on SpecINT2006 LTO bitcode inputs. No assembly change on the testsuite including external. llvm-svn: 231806	2015-03-10 17:40:37 +00:00
Adam Nemet	dee666bc63	[LoopAccesses 1/3] Expose MemoryDepChecker to LAA users LoopDistribution needs to query various results of the dependence analysis. This series will expose some more APIs and state of the dependence checker. This patch is a simple one to just expose the DepChecker instance. The set is compile-time neutral measured with LTO bitcode files of SpecINT2006. Also there is no assembly change on the testsuite. llvm-svn: 231805	2015-03-10 17:40:34 +00:00
Mehdi Amini	a28d91d81b	DataLayout is mandatory, update the API to reflect it with references. Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointer to DataLayout to references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independant, commits but some later commit were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740	2015-03-10 02:37:25 +00:00
Mehdi Amini	46a43556db	Make DataLayout Non-Optional in the Module Summary: DataLayout keeps the string used for its creation. As a side effect it is no longer needed in the Module. This is "almost" NFC, the string is no longer canonicalized, you can't rely on two "equals" DataLayout having the same string returned by getStringRepresentation(). Get rid of DataLayoutPass: the DataLayout is in the Module The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module. Make DataLayout Non-Optional in the Module Module->getDataLayout() will never returns nullptr anymore. Reviewers: echristo Subscribers: resistor, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D7992 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231270	2015-03-04 18:43:29 +00:00
Adam Nemet	9cc0c3999d	[LV/LoopAccesses] Backward dependences are not safe just because the accesses are via different types Noticed this while generalizing the code for loop distribution. I confirmed with Arnold that this was indeed a bug and managed to create a testcase. llvm-svn: 230647	2015-02-26 17:58:48 +00:00
Adam Nemet	1d862af764	[LoopAccesses] Add command-line option for RuntimeMemoryCheckThreshold Also remove the somewhat misleading initializers from VectorizationFactor and VectorizationInterleave. They will get initialized with the default ctor since no cl::init is provided. llvm-svn: 230608	2015-02-26 04:39:09 +00:00
Adam Nemet	8bc61df9f2	[LoopAccesses] LAA::getInfo to use const reference for stride parameter And other required const-correctness fixes to make this work. llvm-svn: 230289	2015-02-24 00:41:59 +00:00
Adam Nemet	57ac766ee9	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229899	2015-02-19 19:15:21 +00:00
Adam Nemet	e91cc6ef93	[LoopAccesses] Add -analyze support The LoopInfo in combination with depth_first is used to enumerate the loops. Right now -analyze is not yet complete. It only prints the result of the analysis, the report and the run-time checks. Printing the unsafe depedences will require a bit more reshuffling which I'd like to do in a follow-on to this patchset. Unsafe dependences are currently checked via -debug-only=loop-accesses in the new test. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229898	2015-02-19 19:15:19 +00:00
Adam Nemet	2bd6e984ef	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229897	2015-02-19 19:15:15 +00:00
Adam Nemet	3e87634fd8	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229896	2015-02-19 19:15:13 +00:00
Adam Nemet	929c38e8ff	[LoopAccesses] Add canAnalyzeLoop This allows the analysis to be attempted with any loop. This feature will be used with -analysis. (LV only requests the analysis on loops that have already satisfied these tests.) This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229895	2015-02-19 19:15:10 +00:00
Adam Nemet	339f42b396	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229894	2015-02-19 19:15:07 +00:00
Adam Nemet	3bfd93d789	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229893	2015-02-19 19:15:04 +00:00
Adam Nemet	436018c3ff	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229892	2015-02-19 19:15:00 +00:00
Adam Nemet	c922853b93	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit them as part of their own report. The currently only user LV is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229891	2015-02-19 19:14:56 +00:00
Adam Nemet	f219c64723	[LoopAccesses] Make VectorizerParams global + fix for cyclic dep As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This commits also has the fix (D7731) to the break dependence cycle between the analysis and vector libraries. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229890	2015-02-19 19:14:52 +00:00
Adam Nemet	04d4163e95	Revert "Reformat." This reverts commit r229651. I'd like to ultimately revert r229650 but this reformat stands in the way. I'll reformat the affected files once the the loop-access pass is fully committed. llvm-svn: 229889	2015-02-19 19:14:34 +00:00
NAKAMURA Takumi	a250484c4c	Reformat. llvm-svn: 229651	2015-02-18 08:36:14 +00:00
NAKAMURA Takumi	fa520c5f49	Revert r229622: "[LoopAccesses] Make VectorizerParams global" and others. r229622 brought cyclic dependencies between Analysis and Vector. r229622: "[LoopAccesses] Make VectorizerParams global" r229623: "[LoopAccesses] Stash the report from the analysis rather than emitting it" r229624: "[LoopAccesses] Cache the result of canVectorizeMemory" r229626: "[LoopAccesses] Create the analysis pass" r229628: "[LoopAccesses] Change debug messages from LV to LAA" r229630: "[LoopAccesses] Add canAnalyzeLoop" r229631: "[LoopAccesses] Add missing const to APIs in VectorizationReport" r229632: "[LoopAccesses] Split out LoopAccessReport from VectorizerReport" r229633: "[LoopAccesses] Add -analyze support" r229634: "[LoopAccesses] Change LAA:getInfo to return a constant reference" r229638: "Analysis: fix buildbots" llvm-svn: 229650	2015-02-18 08:34:47 +00:00
Saleem Abdulrasool	90b1d152b5	Analysis: fix buildbots This should fix the compilation failure on the MSVC buildbots which find a std::make_unique and llvm::make_unique via ADL, resulting in ambiguity. llvm-svn: 229638	2015-02-18 05:09:50 +00:00
Adam Nemet	85fd9f8d09	[LoopAccesses] Change LAA:getInfo to return a constant reference As expected, this required a few more const-correctness fixes. Based on Hal's feedback on D7684. llvm-svn: 229634	2015-02-18 03:44:33 +00:00
Adam Nemet	75bc2d111f	[LoopAccesses] Add -analyze support The LoopInfo in combination with depth_first is used to enumerate the loops. Right now -analyze is not yet complete. It only prints the result of the analysis, the report and the run-time checks. Printing the unsafe depedences will require a bit more reshuffling which I'd like to do in a follow-on to this patchset. Unsafe dependences are currently checked via -debug-only=loop-accesses in the new test. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229633	2015-02-18 03:44:30 +00:00
Adam Nemet	d7350dbb85	[LoopAccesses] Split out LoopAccessReport from VectorizerReport The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229632	2015-02-18 03:44:25 +00:00
Adam Nemet	8b12afbeee	[LoopAccesses] Add missing const to APIs in VectorizationReport When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229631	2015-02-18 03:44:20 +00:00
Adam Nemet	450d417ecf	[LoopAccesses] Add canAnalyzeLoop This allows the analysis to be attempted with any loop. This feature will be used with -analysis. (LV only requests the analysis on loops that have already satisfied these tests.) This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229630	2015-02-18 03:44:08 +00:00
Adam Nemet	a8945b7790	[LoopAccesses] Factor out RuntimePointerCheck::needsChecking Will be used by the new RuntimePointerCheck::print. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229629	2015-02-18 03:43:58 +00:00
Adam Nemet	d0db4c1395	[LoopAccesses] Change debug messages from LV to LAA Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229628	2015-02-18 03:43:37 +00:00
Adam Nemet	d6b7e29815	[LoopAccesses] Create the analysis pass This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229626	2015-02-18 03:43:24 +00:00
Adam Nemet	01abb2c355	[LoopAccesses] Make blockNeedsPredication static blockNeedsPredication is in LoopAccess in order to share it with the vectorizer. It's a utility needed by LoopAccess not strictly provided by it but it's a good place to share it. This makes the function static so that it no longer required to create an LoopAccessInfo instance in order to access it from LV. This was actually causing problems because it would have required creating LAI much earlier that LV::canVectorizeMemory(). This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229625	2015-02-18 03:43:19 +00:00
Adam Nemet	3cf32ad6db	[LoopAccesses] Cache the result of canVectorizeMemory LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229624	2015-02-18 03:42:57 +00:00
Adam Nemet	5474be2c80	[LoopAccesses] Stash the report from the analysis rather than emitting it The transformation passes will query this and then emit them as part of their own report. The currently only user LV is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229623	2015-02-18 03:42:50 +00:00
Adam Nemet	4f3ede5a01	[LoopAccesses] Make VectorizerParams global As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229622	2015-02-18 03:42:43 +00:00
Adam Nemet	30f16e1696	[LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo LoopAccessAnalysis will be used as the name of the pass. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229621	2015-02-18 03:42:35 +00:00
Adam Nemet	7206d7a5d2	[LV] Move addRuntimeCheck to LoopAccessAnalysis This will allow it to be shared with the new Loop Distribution pass. getFirstInst is currently duplicated across LoopVectorize.cpp and LoopAccessAnalysis.cpp. This is a short-term work-around until we figure out a better solution. NFC. (The code moved is adjusted a bit for the name of the Loop member and that PtrRtCheck is now a reference rather than a pointer.) llvm-svn: 228418	2015-02-06 18:31:04 +00:00
Adam Nemet	0456327cfb	[LoopVectorize] Move LoopAccessAnalysis to its own module Other than moving code and adding the boilerplate for the new files, the code being moved is unchanged. There are a few global functions that are shared with the rest of the LoopVectorizer. I moved these to the new module as well (emitLoopAnalysis, stripIntegerCast, replaceSymbolicStrideSCEV) along with the Report class used by emitLoopAnalysis. There is probably room for further improvement in this area. I kept DEBUG_TYPE "loop-vectorize" because it's used as the PassName with emitOptimizationRemarkAnalysis. This will obviously have to change. NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass. llvm-svn: 227756	2015-02-01 16:56:15 +00:00

1 2 3

144 Commits