llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	cc43087afc	[DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC] We only ever use the wrapped domain of AccessSchedule, so stop creating an entire union_map and then pulling the domain out. Reviewers: grosser Tags: #polly Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30179 llvm-svn: 295726	2017-02-21 15:38:31 +00:00
Michael Kruse	9e52c39f0a	[DeLICM] Map values hoisted by LICM back to the array. Implement the -polly-delicm pass. The pass intends to undo the effects of LoopInvariantCodeMotion (LICM) which adds additional scalar dependencies into SCoPs. DeLICM will try to map those scalars back to the array elements they were promoted from, as long as the array element is unused. This is the main patch from the DeLICM/DePRE patch series. It does not yet undo GVN PRE for which additional information about known values is needed and does not handle PHI write accesses that have no target. As such its usefulness is limited. Patches for these issues including regression tests for error situations will follow. Reviewers: grosser Differential Revision: https://reviews.llvm.org/D24716 llvm-svn: 295713	2017-02-21 10:20:54 +00:00
Michael Kruse	d9cdeb453d	[Cmake] Bump required cmake version to 3.4.3. This is currently the minimum required version by LLVM. Since LLVM is needed to build Polly, we also require at least that version. Suggested-by: Philip Pfaffe <philip.pfaffe@gmail.com> llvm-svn: 295672	2017-02-20 17:06:31 +00:00
Michael Kruse	5ab24fdb73	[Cmake] Install the isl headers into the install tree. isl headers are currently missing in a Polly installation. Because the Polly headers depend on those, code can't be compiled against an installed Polly. This patch installs the isl headers. I left a TODO, as optionally it should be possible to use a system version of isl instead of the one shipped with Polly. When compiling, clients of the installation need to add -I${PREFIX}/include/polly/ to their include path right now, because there currently is no way to export this path automatically. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D29931 llvm-svn: 295671	2017-02-20 16:57:14 +00:00
Tobias Grosser	079d511891	[ScopInfo] Count read-only arrays when computing complexity of alias check Instead of counting the number of read-only accesses, we now count the number of distinct read-only array references when checking if a run-time alias check may be too complex. The run-time alias check is quadratic in the number of base pointers, not the number of accesses. Before this change we accidentally skipped SPEC's lbm test case. llvm-svn: 295567	2017-02-18 20:51:29 +00:00
Tobias Grosser	28492b85e2	[DependenceInfo] Pull out statement [NFC] This simplifies the code slightly. llvm-svn: 295551	2017-02-18 16:41:28 +00:00
Tobias Grosser	8ee46985d2	[Dependences] Compute reduction dependences on schedule tree [NFC] This change gets rid of the need for zero padding, makes the reduction computation code more similar to the normal dependence computation, and also better documents what we do at the moment. Making the dependence computation for reductions a little bit easier to understand will hopefully help us to further reduce code duplication. This reduces the time spent only in the reduction dependence pass from 260ms to 150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over 40% in dependence computation time. This change was inspired by discussions with Michael Kruse, Utpal Bora, Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further cleanups of the reduction code. llvm-svn: 295550	2017-02-18 16:39:04 +00:00
Tobias Grosser	41f0d81b31	[test] Add reduction sequence test case [NFC] This test case is a mini performance test case that shows the time needed for a couple of simple reductions. It takes today about 325ms on my machine to run this test case through 'opt' with scop construction and reduction detection. It can be used as mini-proxy for further tuning of the reduction code. Generally we do not commit performance test cases, but as this is very small and also very fast it seems OK to keep it in the lit test suite. This test case will also help to verify that future changes to the reduction code will not affect the ordering of the reduction sets and will consequently not cause spurious performance changes that only result from reordering of dependences in the reduction set. llvm-svn: 295549	2017-02-18 16:38:58 +00:00
Tobias Grosser	2461021150	Drop leftover debug statement llvm-svn: 295444	2017-02-17 13:39:45 +00:00
Tobias Grosser	cd01a363d6	[ScopInfo] Add statistics to count loops after scop modeling llvm-svn: 295431	2017-02-17 08:12:36 +00:00
Tobias Grosser	65ce9362b8	[ScopDetection] Compute the maximal loop depth correctly Before this change, we obtained loop depth numbers that were deeper than the actual loop depth. llvm-svn: 295430	2017-02-17 08:08:54 +00:00
Tobias Grosser	72745c2ef5	Updated isl to isl-0.18-254-g6bc184d This update includes a couple more coalescing changes as well as a large number of isl-internal code cleanups (dead assignments, ...). llvm-svn: 295419	2017-02-17 05:11:16 +00:00
Tobias Grosser	ca2cfd0bd8	[ScopInfo] Do not try to fold array dimensions of size zero Trying to fold such kind of dimensions will result in a division by zero, which crashes the compiler. As such arrays are likely to invalidate the scop anyhow (but are not illegal in LLVM-IR), there is no point in trying to optimize the array layout. Hence, we just avoid the folding of constant dimensions of size zero. llvm-svn: 295415	2017-02-17 04:48:52 +00:00
Tobias Grosser	90411a967b	[ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC] There is only a single disjunction. However, we bound the number of 'disjuncts' in this disjunction. Name the variable accordingly. llvm-svn: 295362	2017-02-16 19:11:33 +00:00
Tobias Grosser	76ec194951	[tests] Fix some misspellings [NFC] llvm-svn: 295361	2017-02-16 19:11:29 +00:00
Tobias Grosser	c8a8276710	[ScopInfo] Bound the number of disjuncts in context Before this change wrapping range metadata resulted in exponential growth of the context, which made context construction of large scops very slow. Instead, we now just do not model the range information precisely, in case the number of disjuncts in the context has already reached a certain limit. llvm-svn: 295360	2017-02-16 19:11:25 +00:00
Tobias Grosser	98a3aa4f19	[ScopInfo] Use uppercase variable name [NFC] llvm-svn: 295350	2017-02-16 18:39:18 +00:00
Tobias Grosser	3281f601bb	[ScopInfo] Always derive upper and lower bounds for parameters Commit r230230 introduced the use of range metadata to derive bounds for parameters, instead of just looking at the type of the parameter. As part of this commit support for wrapping ranges was added, where the lower bound of a parameter is larger than the upper bound: { 255 < p || p < 0 } However, at the same time, for wrapping ranges support for adding bounds given by the size of the containing type has accidentally been dropped. As a result, the range of the parameters was not guaranteed to be bounded any more. This change makes sure we always add the bounds given by the size of the type and then additionally add bounds based on signed wrapping, if available. For a parameter p with a type size of 32 bit, the valid range is then: { -2147483648 <= p <= 2147483647 and (255 < p or p < 0) } llvm-svn: 295349	2017-02-16 18:39:14 +00:00
Roman Gareev	4eb07e481e	[FIX] Fix the typo in ScheduleOptimizer.cpp. llvm-svn: 295292	2017-02-16 07:04:41 +00:00
Michael Kruse	c28c584604	[DeLICM] Add forgotten unittests in previous commit. NFC. llvm-svn: 295204	2017-02-15 17:19:22 +00:00
Michael Kruse	e23e94a08d	[DeLICM] Add Knowledge class. NFC. The Knowledge class remembers the state of data at any timepoint of a SCoP's execution. Currently, it tracks whether an array element is unused or is occupied by some value, and the writes to it. A future addition will be to also remember which value it contains. Objects are used to determine whether two Knowledge contain conflicting information, i.e. two states cannot be true at the same time. This commit was extracted from the DeLICM algorithm at https://reviews.llvm.org/D24716. llvm-svn: 295197	2017-02-15 16:59:10 +00:00
Tobias Grosser	288c450cf6	[ScopDetectDiagnostics] Do not format unnamed array names Formatting unnamed array names is expensive in LLVM as this requires deriving the numbered virtual instruction name (e.g., %12) for an llvm::Value, which is currently not implemented efficiently. As instruction numbers anyhow do not really carry a lot of information for the user, we just print 'unknown' instead. This change reduces the scop detection time from 24 to 19 seconds, for one of our large-scale inputs. This is a reduction by 21%. llvm-svn: 294894	2017-02-12 10:53:02 +00:00
Tobias Grosser	9fe37df27c	[ScopDetection] Add statistics to count the maximal number of scops in loop llvm-svn: 294893	2017-02-12 10:52:57 +00:00
Tobias Grosser	b3a85884f7	Do not use wrapping ranges to bound non-affine accesses When deriving the range of valid values of a scalar evolution expression, the result might be a range [12, 8), where the upper bound is smaller than the lower bound and where the range is expected to possibly wrap around. We theoretically could model such a range as a union of two non-wrapping ranges, but do not do this as of yet. Instead, we just do not derive any bounds. Before this change, we could have obtained bounds where the maximal possible value is strictly smaller than the minimal possible value, which is incorrect and also caused assertions during scop modeling. llvm-svn: 294891	2017-02-12 08:11:12 +00:00
Roman Gareev	b196055c0c	Check reduction dependencies in case of the matrix multiplication optimization To determine parameters of the matrix multiplication, we check RAW dependencies that can be expressed using only reduction dependencies. Consequently, we should check the reduction dependencies, if this is the case. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Sven Verdoolaege <skimo-polly@kotnet.org>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29814 llvm-svn: 294836	2017-02-11 09:59:09 +00:00
Roman Gareev	de69293b01	[FIX] Fix the potential issue of containsOnlyMatMulDep. llvm-svn: 294835	2017-02-11 09:48:09 +00:00
Roman Gareev	5ef7e210c0	[NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294834	2017-02-11 08:43:41 +00:00
Roman Gareev	afcf026d81	[NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294831	2017-02-11 07:14:37 +00:00
Roman Gareev	3d4eae31ea	Use the size of the widest type of the matrix multiplication operands The size of the operands type is the one of the parameters required to determine the BLIS micro-kernel. We get the size of the widest type of the matrix multiplication operands in case there are several different types. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29269 llvm-svn: 294828	2017-02-11 07:00:05 +00:00
Tobias Grosser	30a02088c0	Porting the example illustrating Polly from HTML to reStructuredText http://polly.llvm.org/example_manual_matmul.html which illustrates individual passes of Polly, has been ported to reStructuredText and necessary changes have been made to the configuration files used by SPHINX to include the new source as a part of the documentation. Contributed-by: Singapuram Sanjay Srivallabh <singapuram.sanjay@gmail.com> Differential Revision: https://reviews.llvm.org/D25163 llvm-svn: 294735	2017-02-10 11:46:57 +00:00
Tobias Grosser	296fe2e2ad	[ScopInfo] Use original base address when building ScopArrayInfo [NFC] This change clarifies that we want to indeed use the original base address when creating the ScopArrayInfo that corresponds to a given memory access. This change prepares for https://reviews.llvm.org/D28518. llvm-svn: 294734	2017-02-10 10:09:46 +00:00
Tobias Grosser	5db171a9da	[ScopInfo] Use getAccessValue to obtain the accessed value This replaces the use of getOriginalAddrPtr, a value that is stored in ScopArrayInfo and might at some point not be unique any more. However, the access value is defined to be unique. This change is an update on r294576, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294733	2017-02-10 10:09:44 +00:00
Tobias Grosser	583be06fb2	[BlockGenerator] Use MemoryAccess::getAccessValue to get load instruction When generating code in the BlockGenerator we copy all (interesting) instructions and keep track of the new values in a basic block map. To obtain the original llvm::Value that belongs to a load memory access, we use getAccessValue() instead of getOriginalBaseAddr(). The former always references the instruction we use to load values from. The latter, on the other hand, is obtained from the corresponding ScopArrayInfo and would not be unique in case ScopArrayInfo objects at some point allow memory accesses with different base addresses. This change is an update on r294566, which only clarified that we need the original memory access, but where we still remained dependent to have one base pointer per scop. This change removes unnecessary uses of MemoryAddress::getOriginalBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294669	2017-02-09 23:54:23 +00:00
Tobias Grosser	e24b7b929d	[ScopInfo] Use MemoryAccess::getScopArrayInfo() interface to access Array [NFC] By using the public interface MemoryAccess::getScopArrayInfo() we avoid the direct access to the ScopArrayInfoMap and as a result also do not need to use the BasePtr as key. This change makes the code cleaner. The const-cast we introduce is a little ugly. We may consider to drop const correctness for getScopArrayInfo() at some point. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294655	2017-02-09 23:24:57 +00:00
Tobias Grosser	9c7d181c92	[ScopInfo] Use types instead of 'auto' and use more descriptive variable names [NFC] LLVM's coding conventions suggest to use auto only in obvious cases. Hence, we move this code to actually declare the types used. We also replace the variable name 'SAI', with the name 'Array', as this improves readability. llvm-svn: 294654	2017-02-09 23:24:54 +00:00
Tobias Grosser	889830b1c5	[ScopInfo] Use ScopArrayInfo instead of base address When building alias groups, we sort different ScopArrays into unrelated groups. Historically we identified arrays through their base pointer, as no ScopArrayInfo class was yet available. This change changes the alias group construction to reference arrays through their ScopArrayInfo object. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294649	2017-02-09 23:12:22 +00:00
Tobias Grosser	be372d5a04	[ScopInfo] Expect the OriginalBaseAddr when looking at underlying instructions [NFC] During SCoP construction we sometimes inspect the underlying IR by looking at the base address of a MemoryAccess. In such cases, we always want the original base address. Make this clear by calling getOriginalBaseAddr(). This is a non-functional change as getBaseAddr maps to getOriginalBaseAddr at the moment. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294576	2017-02-09 10:11:58 +00:00
Tobias Grosser	e0e0e4d4f6	[ScopInfo] Remove unnecessary indirection through SCEV [NFC] The base address of a memory access is already an llvm::Value. Hence, there is no need to go through SCEV, but we can directly work with the llvm::Value. Also use 'Value *' instead of 'auto' for cases where the type is not obvious. llvm-svn: 294575	2017-02-09 09:34:46 +00:00
Tobias Grosser	4553463be4	[IRBuilder] Extract base pointers directly from ScopArray Instead of iterating over statements and their memory accesses to extract the set of available base pointers, just directly iterate over all ScopArray objects. This reflects more the actual intent of the code: collect all arrays (and their base pointers) to emit alias information that specifies that accesses to different arrays cannot alias. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294574	2017-02-09 09:34:42 +00:00
Roman Gareev	028ba3702c	[FIX] Disable the problematic run lines There are problems with using the machine information to derive the precise vector size on polly-amd64-linux and polly-arm-linux. We temporarily disable the problematic run lines. llvm-svn: 294571	2017-02-09 09:03:13 +00:00
Roman Gareev	2d0d294e3c	[FIX] Specify the CPU to overwrite the machine info and set a fixed vector size. llvm-svn: 294569	2017-02-09 08:29:55 +00:00
Tobias Grosser	26fb7d7517	[IslAst] Print the ScopArray name to mark reductions Before this change we used the name of the base pointer to mark reductions. This is imprecise as the canonical reference is the ScopArray itself and not the basepointer of a reduction. Using the base pointer of reductions is problematic in cases where a single ScopArray is referenced through two different base pointers. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294568	2017-02-09 08:06:15 +00:00
Tobias Grosser	114f6d6ff5	[DependenceInfo] Use ScopArrayInfo to keep track of arrays [NFC] When computing reduction dependences we first identify all ScopArrays which are part of reductions and then only compute for these ScopArrays the more detailed data dependences that allow us to identify reductions and optimize across them. Instead of using the base pointer as identifier of a ScopArray, it is clearer and more understandable to directly use the ScopArray as identifier. This change implements such a switch. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294567	2017-02-09 08:06:05 +00:00
Tobias Grosser	02400a0e0c	[BlockGenerator] BBMap uses original BaseAddress for scalar loads [NFC] When regenerating code in the BlockGenerator we copy instructions that may reference scalar values, for which the new value of a given scalar is looked up in BBMap using the original scalar llvm::Value as index. It is consequently necessary that (re)loaded scalar values are made available in BBMap using the original llvm::Value as key independently if the llvm::Value was (re)loaded from the original scalar or a new access function has been specified that caused the value to be reloaded from an array with a different base address. We make this clear by using MemoryAccess::getOriginalBaseAddr() instead of MemoryAccess::getBaseAddr() as index to BBMap. This change removes unnecessary uses of MemoryAddress::getBaseAddr() in preparation for https://reviews.llvm.org/D28518. llvm-svn: 294566	2017-02-09 08:05:50 +00:00
Roman Gareev	9989088ee9	Isolate a set of partial tile prefixes in case of the matrix multiplication optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by tile sizes and we tile and unroll the point loop, the isl generates conditional expressions. Subsequently, the conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Nr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. Probably this approach can be replaced with padding in future. In case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in case of scalar loop bounds. It also causes a compile-time regression. The compile time is increased from 0.795 seconds to 0.837 seconds in case of scalar loop bounds and from 1.222 seconds to 1.490 seconds in case of parametric loop bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564	2017-02-09 07:10:01 +00:00
Roman Gareev	772498dc68	[NFC] Make ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern This patch makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, it could not use the isolate option, because standardBandOpts could try to tile a band node with anchored subtree and get the error, since the use of the isolate option causes any tree containing the node to be considered anchored. Furthermore, it is not intended to apply standard optimizations, when the matrix multiplication has been detected. llvm-svn: 294444	2017-02-08 13:29:06 +00:00
Michael Kruse	49c21222a0	[External] Move lib/JSON to lib/External/JSON. NFC. For consistency with isl and ppcg which are already in lib/External. llvm-svn: 294126	2017-02-05 15:26:56 +00:00
Michael Kruse	acb08aaed5	[Support] Add convertZoneToTimepoints. NFC. This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). In contrast to computeReachingWrite and computeArrayUnused, convertZoneToTimepoints implies a format for zones (ranges between timepoints). Zones at the moment are unique to DeLICM, but convertZoneToTimepoints makes most sense in conjunction with the previous two functions. llvm-svn: 294094	2017-02-04 15:42:17 +00:00
Michael Kruse	ec67d36493	[Support] Add computeArrayUnused. NFC. This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294093	2017-02-04 15:42:10 +00:00
Michael Kruse	f4dc133e69	[Support] Add computeReachingWrite. NFC. This function has been extracted from the upcoming DeLICM patch (https://reviews.llvm.org/D24716). llvm-svn: 294092	2017-02-04 15:42:01 +00:00
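
Note on commit 3281f601bb: the parameter bounds described in its message can be illustrated with a minimal sketch. This is plain Python, not Polly or isl code; the function names and the way the excluded gap is passed in are assumptions made only for illustration. It checks the two conditions the message combines: the bounds given by the size of the type, and the wrapping range that excludes a gap of values (here 0 to 255, matching "255 < p or p < 0").

```python
def signed_type_bounds(bits):
    """Inclusive (lo, hi) bounds of a signed integer type with `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def in_valid_range(p, bits, gap_lo, gap_hi):
    """Check p against the type bounds AND the wrapping range.

    The wrapping range is modeled (hypothetically) as "everything except
    the inclusive gap [gap_lo, gap_hi]", i.e. p < gap_lo or p > gap_hi.
    """
    lo, hi = signed_type_bounds(bits)
    within_type = lo <= p <= hi              # always added, per the commit
    outside_gap = p < gap_lo or p > gap_hi   # the wrapping part of the range
    return within_type and outside_gap


if __name__ == "__main__":
    # 32-bit parameter with the wrapping range { 255 < p or p < 0 }.
    for value in (-1, 0, 255, 256, 2**31):
        print(value, in_valid_range(value, 32, 0, 255))
```

Without the type bounds, the wrapping condition alone would accept arbitrarily large or small values, which is the unbounded-parameter situation the commit message describes.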

2907 Commits