llvm-project

Author	SHA1	Message	Date
Michael Kruse	d8d32bb3d1	[DeLICM] Regression test for skipping map targets. Add optimization-remarks-missed for when mapping targets have been skipped and add regression tests for them. llvm-svn: 295953	2017-02-23 10:25:20 +00:00
Michael Kruse	deb30e8278	[DeLICM] Add regression tests for DeLICM reject cases. These tests were not included in the main DeLICM commit. These check the cases where zone analysis cannot be successful because of assumption violations. We use the LLVM optimization remark infrastructure as it seems to be the best fit for this kind of message. I tried to make use of the OptimizationRemarkEmitter. However, it would insert additional function passes into the pass manager to get the hotness information. The pass manager would insert them between the flatten pass and delicm, causing the ScopInfo with the flattened schedule to be thrown away. Differential Revision: https://reviews.llvm.org/D30253 llvm-svn: 295846	2017-02-22 15:14:08 +00:00
Michael Kruse	8474470500	[DeLICM] Fix wrong comment. NFC. Correct a comment that claimed that a store after load was detected when the code checks a load after a store. llvm-svn: 295835	2017-02-22 14:14:40 +00:00
Michael Kruse	43ed25f1d9	[DeLICM] Print message when zone analysis is not available on -analysis. This is to distinguish the case where the analysis has failed from the case where no transformation was performed. llvm-svn: 295833	2017-02-22 13:48:35 +00:00
Michael Kruse	91cdafb86f	[DeLICM] Use opt<int>. There is no template specialization for cl::parser<unsigned long> such that parsing a cl::opt<unsigned long> command line argument will fail. Use opt<int> instead which has an associated parser. llvm-svn: 295832	2017-02-22 13:48:18 +00:00
Michael Kruse	9e52c39f0a	[DeLICM] Map values hoisted by LICM back to the array. Implement the -polly-delicm pass. The pass intends to undo the effects of LoopInvariantCodeMotion (LICM) which adds additional scalar dependencies into SCoPs. DeLICM will try to map those scalars back to the array elements they were promoted from, as long as the array element is unused. This is the main patch from the DeLICM/DePRE patch series. It does not yet undo GVN PRE for which additional information about known values is needed and does not handle PHI write accesses that have no target. As such its usefulness is limited. Patches for these issues including regression tests for error situations will follow. Reviewers: grosser Differential Revision: https://reviews.llvm.org/D24716 llvm-svn: 295713	2017-02-21 10:20:54 +00:00
Roman Gareev	4eb07e481e	[FIX] Fix the typo in ScheduleOptimizer.cpp. llvm-svn: 295292	2017-02-16 07:04:41 +00:00
Michael Kruse	e23e94a08d	[DeLICM] Add Knowledge class. NFC. The Knowledge class remembers the state of data at any timepoint of a SCoP's execution. Currently, it tracks whether an array element is unused or is occupied by some value, and the writes to it. A future addition will be to also remember which value it contains. Objects are used to determine whether two Knowledge contain conflicting information, i.e. two states cannot be true at the same time. This commit was extracted from the DeLICM algorithm at https://reviews.llvm.org/D24716. llvm-svn: 295197	2017-02-15 16:59:10 +00:00
Roman Gareev	b196055c0c	Check reduction dependencies in case of the matrix multiplication optimization To determine parameters of the matrix multiplication, we check RAW dependencies that can be expressed using only reduction dependencies. Consequently, we should check the reduction dependencies, if this is the case. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Sven Verdoolaege <skimo-polly@kotnet.org> Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29814 llvm-svn: 294836	2017-02-11 09:59:09 +00:00
Roman Gareev	de69293b01	[FIX] Fix the potential issue of containsOnlyMatMulDep. llvm-svn: 294835	2017-02-11 09:48:09 +00:00
Roman Gareev	5ef7e210c0	[NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294834	2017-02-11 08:43:41 +00:00
Roman Gareev	afcf026d81	[NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp. llvm-svn: 294831	2017-02-11 07:14:37 +00:00
Roman Gareev	3d4eae31ea	Use the size of the widest type of the matrix multiplication operands The size of the operands type is the one of the parameters required to determine the BLIS micro-kernel. We get the size of the widest type of the matrix multiplication operands in case there are several different types. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29269 llvm-svn: 294828	2017-02-11 07:00:05 +00:00
Roman Gareev	9989088ee9	Isolate a set of partial tile prefixes in case of the matrix multiplication optimization Isolate a set of partial tile prefixes to allow hoisting and sinking out of the unrolled innermost loops produced by the optimization of the matrix multiplication. In case it cannot be proved that the number of loop iterations can be evenly divided by tile sizes and we tile and unroll the point loop, isl generates conditional expressions. Subsequently, the conditional expressions can prevent stores and loads of the unrolled loops from being sunk and hoisted. The patch isolates a set of partial tile prefixes, which have exactly Mr x Nr iterations of the two innermost loops, the result of the loop tiling performed by the matrix multiplication optimization, where Mr and Nr are parameters of the micro-kernel. This helps to get rid of the conditional expressions of the unrolled innermost loops. Probably this approach can be replaced with padding in future. In case of, for example, the gemm from Polybench/C 3.2 and parametric loop bounds, it helps to increase the performance from 7.98 GFlops (27.71% of theoretical peak) to 21.47 GFlops (74.57% of theoretical peak). Hence, we get the same performance as in case of scalar loop bounds. It also causes a compile-time regression. The compile time is increased from 0.795 seconds to 0.837 seconds in case of scalar loop bounds and from 1.222 seconds to 1.490 seconds in case of parametric loop bounds. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D29244 llvm-svn: 294564	2017-02-09 07:10:01 +00:00
Roman Gareev	772498dc68	[NFC] Make ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern This patch makes ScheduleTreeOptimizer::optimizeBand return a schedule node optimized with optimizeMatMulPattern. Otherwise, it could not use the isolate option, because standardBandOpts could try to tile a band node with anchored subtree and get the error, since the use of the isolate option causes any tree containing the node to be considered anchored. Furthermore, it is not intended to apply standard optimizations, when the matrix multiplication has been detected. llvm-svn: 294444	2017-02-08 13:29:06 +00:00
Roman Gareev	98075fe181	A new algorithm for identification of a SCoP statement that implements a matrix multiplication The current identification of a SCoP statement that implements a matrix multiplication does not help to identify different permutations of loops that contain it and check for dependencies, which can prevent it from being optimized. It also requires external determination of the operands of the matrix multiplication. This patch contains the implementation of a new algorithm that helps to avoid these issues. It also modifies the test cases that generate matrix multiplications with linearized accesses, because the new algorithm does not support them. Reviewed-by: Michael Kruse <llvm@meinersbur.de>, Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28357 llvm-svn: 293890	2017-02-02 14:23:14 +00:00
Tobias Grosser	ff40087a6a	Update to recent formatting changes llvm-svn: 293756	2017-02-01 10:12:09 +00:00
Roman Gareev	7758a2af53	Update the documentation on how the packing transformation is implemented Add a simple example to update the documentation on how the packing transformation is implemented. Reviewed-by: Tobias Grosser <tobias@grosser.es>, Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D28021 llvm-svn: 293429	2017-01-29 10:37:50 +00:00
Michael Kruse	33dc454700	[CodePrepa] Remove unused declaration. NFC. llvm-svn: 293304	2017-01-27 16:59:09 +00:00
Tobias Grosser	21a059af09	Adjust formatting to commit r292110 [NFC] llvm-svn: 292123	2017-01-16 14:08:10 +00:00
Tobias Grosser	67e94fb435	ScheduleOptimizer: Allow to set register width in command line We use this option to set a fixed register width in our test cases to make sure the results are identical across platforms. llvm-svn: 292002	2017-01-14 07:14:54 +00:00
Roman Gareev	1c2927b209	Specify the default values of the cache parameters If the parameters of the target cache (i.e., cache level sizes, cache level associativities) are not specified or have wrong values, we use ones for parameters of the macro-kernel and do not perform data-layout optimizations of the matrix multiplication. In this patch we specify the default values of the cache parameters to be able to apply the pattern matching optimizations even in this case. Since there are no typical values of these parameters, we use the parameters of Intel Core i7-3820 SandyBridge that also help to attain high performance on IBM POWER System S822 and IBM Power 730 Express server. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28090 llvm-svn: 290518	2016-12-25 16:32:28 +00:00
Tobias Grosser	0791d5f5aa	ScheduleOptimizer: Fix spelling of option '-polly-target-throughput-vector-fma' througput -> throughput llvm-svn: 290418	2016-12-23 07:33:39 +00:00
Roman Gareev	be5299af0b	Change the determination of parameters of macro-kernel Typically processor architectures do not include an L3 cache, which means that Nc, the parameter of the micro-kernel, is, for all practical purposes, redundant ([1]). However, its small values can cause the redundant packing of the same elements of the matrix A, the first operand of the matrix multiplication. At the same time, big values of the parameter Nc can cause segmentation faults in case the available stack is exceeded. This patch adds an option to specify the parameter Nc as a multiple of the parameter of the micro-kernel Nr. In case of Intel Core i7-3820 SandyBridge and the following options, clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8 it helps to improve the performance from 11.303 GFlops/sec (39,247% of theoretical peak) to 17.896 GFlops/sec (62,14% of theoretical peak). Refs.: [1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D28019 llvm-svn: 290256	2016-12-21 12:51:12 +00:00
Roman Gareev	92c446016a	[Polly] Use three-dimensional arrays to store packed operands of the matrix multiplication Previously we had two-dimensional accesses to store packed operands of the matrix multiplication for the sake of simplicity of the packed arrays. However, addition of the third dimension helps to simplify the corresponding memory access, reduce the execution time of isl operations applied to it, and consequently reduce the compile-time of Polly. For example, in case of Intel Core i7-3820 SandyBridge and the following options, clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=7 it helps to reduce the compile-time from about 361.456 seconds to about 0.816 seconds. Reviewed-by: Michael Kruse <llvm@meinersbur.de>, Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D27878 llvm-svn: 290251	2016-12-21 11:18:42 +00:00
Roman Gareev	2606c48a1d	Restrict ranges of extension maps To prevent copy statements from accessing arrays out of bounds, ranges of their extension maps are restricted, according to the constraints of domains. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D25655 llvm-svn: 289815	2016-12-15 12:35:59 +00:00
Roman Gareev	15db81ef71	[NFC] Fix typos in getMacroKernelParams. llvm-svn: 289808	2016-12-15 12:00:57 +00:00
Roman Gareev	8babe1a216	The order of the loops defines the data reused in the BLIS implementation of gemm ([1]). In particular, elements of the matrix B, the second operand of matrix multiplication, are reused between iterations of the innermost loop. To keep the reused data in cache, only elements of matrix A, the first operand of matrix multiplication, should be evicted during an iteration of the innermost loop. To provide such a cache replacement policy, elements of the matrix A can, in particular, be loaded first and, consequently, be least-recently-used. In our case matrices are stored in row-major order instead of column-major order used in the BLIS implementation ([1]). One of the ways to address it is to accordingly change the order of the loops of the loop nest. However, it makes elements of the matrix A be reused in the innermost loop and, consequently, requires loading elements of the matrix B first. Since the LLVM vectorizer always generates loads from the matrix A before loads from the matrix B, we cannot provide it. Consequently, we only change the BLIS micro kernel and the computation of its parameters instead. In particular, reused elements of the matrix B are successively multiplied by specific elements of the matrix A. Refs.: [1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D25653 llvm-svn: 289806	2016-12-15 11:47:38 +00:00
Michael Kruse	79c0173f53	[ScheduleOptimizer] Fix memory leak. NFC. llvm-svn: 289434	2016-12-12 14:51:06 +00:00
Johannes Doerfert	2df9963fe3	Rerun mem2reg after the inliner It did happen that after the inliner finished we end up with promotable allocas in a function. We now run mem2reg to make sure everything is promoted if possible. llvm-svn: 288514	2016-12-02 17:43:57 +00:00
Michael Kruse	36e79ecaec	[DeLICM] Add pass boilerplate code. Add an empty DeLICM pass, without any functional parts. Extracting the boilerplate from the functional part reduces the size of the code to review (https://reviews.llvm.org/D24716) Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 288160	2016-11-29 16:41:21 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	5aea5653b3	FlattenAlgo: Ensure we _really_ obtain a param space This resolves "isl_space.c:1775: not a parameter space" errors I have seen on two systems. llvm-svn: 281052	2016-09-09 16:11:26 +00:00
Michael Kruse	7886bd7ca5	Add -polly-flatten-schedule pass. The -polly-flatten-schedule pass reduces the number of scattering dimensions in its isl_union_map form to make them easier to understand. It is not meant to be used in production, only for debugging and regression tests. To illustrate how it can make sets simpler, here is a lifetime set computed by the proposed DeLICM pass without flattening: { Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0; Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5; Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0; Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3; Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 } And here the same lifetime for a semantically identical one-dimensional schedule: { Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 } Differential Revision: https://reviews.llvm.org/D24310 llvm-svn: 280948	2016-09-08 15:02:36 +00:00
Tobias Grosser	c80d6979bd	Drop '@brief' from doxygen comments LLVM's coding guideline suggests to not use @brief for one-sentence doxygen comments to improve readability. Switch this once and for all to ensure people do not copy @brief comments from other parts of Polly, when writing new code. llvm-svn: 280468	2016-09-02 06:33:33 +00:00
Roman Gareev	5f99f8656e	Add a flag to dump SCoP optimized with the IslScheduleOptimizer pass Dump polyhedral descriptions of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations applied on the schedule tree to be able to check the work of the IslScheduleOptimizer pass at the polyhedral level. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23740 llvm-svn: 279395	2016-08-21 11:20:39 +00:00
Roman Gareev	1c892e91e3	Perform replacement of access relations and creation of new arrays according to the packing transformation This is the third patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform replacement of the access relations and create empty arrays, which are steps to implement the packing transformation. In subsequent changes we will implement copying to created arrays. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D22187 llvm-svn: 278666	2016-08-15 12:22:54 +00:00
Tobias Grosser	2219d15748	Fix a couple of spelling mistakes llvm-svn: 277569	2016-08-03 05:28:09 +00:00
Roman Gareev	3a18a931a8	Apply all necessary tilings and interchangings to get a macro-kernel This is the second patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS macro-kernel by applying a combination of tiling and interchanging. In subsequent changes we will implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21491 llvm-svn: 276627	2016-07-25 09:42:53 +00:00
Roman Gareev	2cb4d133f5	[NFC] Refactor creation of the BLIS micro-kernel and improve documentation Reviewed-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 276616	2016-07-25 07:27:59 +00:00
Tobias Grosser	3898a0468c	Propagate on-error status This ensures that the error status set with -polly-on-isl-error-abort is maintained even after running DependenceInfo and ScheduleOptimizer. Both passes temporarily set the error status to CONTINUE as the dependence analysis uses a compute-out and the scheduler may not be able to derive a schedule. In both cases we want to not abort, but to handle the error gracefully. Before this commit, we always set the error reporting to ABORT after these passes. After this commit, we use the error reporting mode that was active earlier. This comes without a test case as this would require us to introduce (memory) errors which would trigger the isl errors. llvm-svn: 274272	2016-06-30 20:42:58 +00:00
Tobias Grosser	af14993016	Simplify: get isl_ctx only once [NFC] ... instead of call S.getIslCtx() many times. llvm-svn: 274271	2016-06-30 20:42:56 +00:00
Tobias Grosser	522478d2c0	clang-tidy: Add llvm namespace comments llvm commonly adds a comment to the closing brace of a namespace to indicate which namespace is closed. clang-tidy provides with llvm-namespace-comment a handy tool to check for this habit. We use it to ensure we consistently use namespace comments in Polly. There are slightly different styles in how namespaces are closed in LLVM. As there is no large difference between the different comment styles we go for the style clang-tidy suggests by default. To reproduce this fix run: for i in `ls tools/polly/lib/*/*.cpp`; \ clang-tidy -checks='-*,llvm-namespace-comment' -p build $i -fix \ -header-filter=".*"; \ done This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in http://reviews.llvm.org/D21488 and was split out to increase readability. llvm-svn: 273621	2016-06-23 22:17:27 +00:00
Tobias Grosser	1a1056798b	Fix separator in header comment This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in http://reviews.llvm.org/D21488 and was split out to increase readability. llvm-svn: 273437	2016-06-22 16:29:33 +00:00
Tobias Grosser	8dd653d983	clang-tidy: apply modern-use-nullptr fixes Instead of using 0 or NULL use the C++11 nullptr symbol when referencing null pointers. This cleanup was suggested by Eugene Zelenko <eugene.zelenko@gmail.com> in http://reviews.llvm.org/D21488 and was split out to increase readability. llvm-svn: 273435	2016-06-22 16:22:00 +00:00
Roman Gareev	397a34a08d	[NFC] Use isl_schedule_node_band_n_member to get the number of dimensions of a band node. llvm-svn: 273400	2016-06-22 12:11:30 +00:00
Roman Gareev	42402c9e89	Apply all necessary tilings and unrollings to get a micro-kernel This is the first patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS micro-kernel by applying a combination of tiling and unrolling. In subsequent changes we will add the extraction of the BLIS macro-kernel and implement the packing transformation. Contributed-by: Roman Gareev <gareevroman@gmail.com> Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21140 llvm-svn: 273397	2016-06-22 09:52:37 +00:00
Roman Gareev	b17b9a8324	[NFC] Outline the application of register tiling. llvm-svn: 272515	2016-06-12 17:20:05 +00:00
Roman Gareev	827264de98	[NFC] "#include <ciso646>" is unnecessary, because "and", "or" were replaced by "&&", "||". llvm-svn: 272168	2016-06-08 16:44:11 +00:00
Roman Gareev	ba0fb97c0a	[NFC] Check that a parameter of ScheduleTreeOptimizer::isMatrMultPattern contains a correct partial schedule llvm-svn: 271780	2016-06-04 06:34:04 +00:00
Roman Gareev	4b8c7aeb62	[FIX] Fix potential issue related to subtraction from an unsigned 0 in circularShiftOutputDims Reported-by: Mehdi Amini <mehdi.amini@apple.com> Contributed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: http://reviews.llvm.org/D20969 llvm-svn: 271705	2016-06-03 18:46:29 +00:00
Roman Gareev	76614d3ed9	[GSoC 2016] [Polly] [FIX] Determination of statements that contain matrix multiplication Fix small issues related to characters, operators and descriptions of tests. Differential Revision: http://reviews.llvm.org/D20806 llvm-svn: 271264	2016-05-31 11:22:21 +00:00
Johannes Doerfert	99191c78c2	Decouple SCoP building logic from pass Created a new pass ScopInfoRegionPass. As name suggests, it is a region pass and it is there to preserve compatibility with our existing Polly passes. ScopInfoRegionPass will return a SCoP object for a valid region while the creation of the SCoP stays in the ScopInfo class. Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in> Reviewed-by: Tobias Grosser <tobias@grosser.es>, Johannes Doerfert <doerfert@cs.uni-saarland.de> Differential Revision: http://reviews.llvm.org/D20770 llvm-svn: 271259	2016-05-31 09:41:04 +00:00
Michael Kruse	7410a27820	MSVC compile fix: #include <ciso646>. NFC. This header is required to make the ISO 646 alternative operator spellings ("and", "or" instead of "&&", "||") work. Should these operators be replaced by the standard ones as already suggested by Johannes, also remove this #include again. llvm-svn: 271206	2016-05-30 14:27:14 +00:00
Roman Gareev	9c3eb5960a	Determination of statements that contain matrix multiplication Add determination of statements that contain, in particular, matrix multiplications and can be optimized with [1] to try to get close-to-peak performance. It can be enabled via polly-pm-based-opts, which is false by default. Refs: [1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf Contributed-by: Roman Gareev <gareevroman@gmail.com> Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D20575 llvm-svn: 271128	2016-05-28 16:17:58 +00:00
Michael Kruse	315aa3278e	[ScheduleOptimizer] Add -polly-opt-outer-coincidence option. Add a command line switch to set the isl_options_set_schedule_outer_coincidence option. ISL then tries to build schedules where the outer member of a band satisfies the coincidence constraints. In practice this allows loop skewing for more parallelism in inner loops. llvm-svn: 268222	2016-05-02 11:35:27 +00:00
Johannes Doerfert	3c6a99b818	Add __isl_give annotations to return types [NFC] llvm-svn: 265882	2016-04-09 21:55:23 +00:00
Hongbin Zheng	2a798853f8	Allow the client of DependenceInfo to obtain dependences at different granularities. llvm-svn: 262591	2016-03-03 08:15:33 +00:00
Hongbin Zheng	defd098612	Adapt to LLVM head, again llvm-svn: 261905	2016-02-25 17:54:42 +00:00
Hongbin Zheng	566c614525	Revert "Adapt to LLVM head. NFC" This reverts commit 4d3753b9646a69c00d234ccd6e91dc3d0ea5d643. llvm-svn: 261892	2016-02-25 16:46:17 +00:00
Hongbin Zheng	f4e35f9cb9	Adapt to LLVM head. NFC llvm-svn: 261886	2016-02-25 16:36:09 +00:00
Roman Gareev	11001e1534	Annotation of SIMD loops Use 'mark' nodes to annotate a SIMD loop during ScheduleTransformation and skip parallelism checks. The buildbot shows the following compile/execution time changes: Compile time: Improvements Δ Previous Current σ …/gesummv -6.06% 0.2640 0.2480 0.0055 …/gemver -4.46% 0.4480 0.4280 0.0044 …/covariance -4.31% 0.8360 0.8000 0.0065 …/adi -3.23% 0.9920 0.9600 0.0065 …/doitgen -2.53% 0.9480 0.9240 0.0090 …/3mm -2.33% 1.0320 1.0080 0.0087 Execution time: Regressions Δ Previous Current σ …/viterbi 1.70% 5.1840 5.2720 0.0074 …/smallpt 1.06% 12.4920 12.6240 0.0040 Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D14491 llvm-svn: 261620	2016-02-23 09:00:13 +00:00
Tobias Grosser	5624d3c978	Adjust formatting to clang-format changes in 256149 llvm-svn: 256151	2015-12-21 12:38:56 +00:00
Tobias Grosser	ca7f5bb767	Full/partial tile separation for vectorization We isolate full tiles from partial tiles to be able to, for example, vectorize loops with parametric lower and/or upper bounds. If we use -polly-vectorizer=stripmine, we can see execution-time improvements: correlation from 1m7361s to 0m5720s (-67.05 %), covariance from 1m5561s to 0m5680s (-63.50 %), ary3 from 2m3201s to 1m2361s (-46.72 %), CrystalMk from 8m5565s to 7m4285s (-13.18 %). The current full/partial tile separation increases compile-time more than necessary. As a result, we see compile-time regressions, for example, for 3mm from 0m6320s to 0m9881s (56.34%). Some of this compile time increase is expected as we generate more IR and consequently more time is spent in the LLVM backends. However, a first investigation has shown that a larger portion of compile time is unnecessarily spent inside Polly's parallelism detection and could be eliminated by propagating existing knowledge about vector loop parallelism. Before enabling -polly-vectorizer=stripmine by default, it is necessary to address this compile-time issue. Contributed-by: Roman Gareev <gareevroman@gmail.com> Reviewers: jdoerfert, grosser Subscribers: grosser, #polly Differential Revision: http://reviews.llvm.org/D13779 llvm-svn: 250809	2015-10-20 09:12:21 +00:00
Johannes Doerfert	01978cfa0c	Remove independent blocks pass Polly can now be used as an analysis-only tool as long as the code generation is disabled. However, we do not have an alternative to the independent blocks pass in place yet, though in the relevant cases this does not seem to impact the performance much. Nevertheless, a virtual alternative that allows the same transformations without changing the input region will follow shortly. llvm-svn: 250652	2015-10-18 12:28:00 +00:00
Tobias Grosser	0e3a6b13a4	Sort includes using 'clang-format -sort-includes' llvm-svn: 250392	2015-10-15 12:17:36 +00:00
Tobias Grosser	f30be2f370	RegisterPasses: Optionally run inliner before Polly This will allow us to optimize C++ template code with Polly. This support is mostly for debugging purpose and individual experiments. The ultimate goal is still to run Polly later in the pass manager when inlining already happened. llvm-svn: 250092	2015-10-12 20:03:44 +00:00
Johannes Doerfert	f363ed9804	[NFC] Move helper functions to ScopHelper Helper functions in the BlockGenerators.h/cpp introduce dependences from the frontend to the backend of Polly. As they are used in ScopDetection, ScopInfo, etc. we move them to the ScopHelper file. llvm-svn: 249919	2015-10-09 23:40:24 +00:00
Johannes Doerfert	45be64464b	[NFC] Consistently use commented and annotated ScopPass functions The changes affect methods that are part of the Pass interface and include: - Comments that describe the method's purpose. - A consistent use of the keywords override and virtual. Additionally, the printScop method is now optional and removed from SCoP passes that do not implement it. llvm-svn: 248685	2015-09-27 15:43:29 +00:00
Johannes Doerfert	0f37630849	[NFC] Use releaseMemory to release internal memory llvm-svn: 248684	2015-09-27 15:42:28 +00:00
Chandler Carruth	66ef16b289	[PM] Update Polly for the new AA infrastructure landed in r247167. llvm-svn: 247198	2015-09-09 22:13:56 +00:00
Tobias Grosser	fa57e9b7e6	Make our data-locality schedule tree transforms externally accessible Other passes which perform different optimizations might be interested in also applying data-locality transformations as part of their overall transformation. llvm-svn: 245824	2015-08-24 06:01:47 +00:00
Tobias Grosser	1ac884d73a	Use marker nodes to annotate the different levels of tiling Currently, marker nodes are ignored during AST generation, but visible in the -debug-only=polly-ast output. llvm-svn: 245809	2015-08-23 09:11:00 +00:00
Tobias Grosser	fc490a99f5	Do really not unroll the vector loop in combination with register tiling The previous commit lacked a test case for register tiling + pre-vectorization and we obviously got it immediately wrong. llvm-svn: 245599	2015-08-20 19:08:16 +00:00
Tobias Grosser	42e2489553	Add experimental support for trivial register tiling Register tiling in Polly is for now just an additional level of tiling which is fully unrolled. It is disabled by default. To make this useful for more than experiments, we still need a cost function as well as possibly further optimizations that teach LLVM to actually put some of the values we got into scalar registers. llvm-svn: 245564	2015-08-20 13:45:05 +00:00
Tobias Grosser	0483271662	Add support for two-level tiling By default we only use one level of tiling for loops, but in general tiling for multiple levels is trivial for us. Hence, we add a set of options that allow people to play with a second level of tiling. If this is profitable for some cases we can work on heuristics that allow us to identify these cases and use two-level tiling for them. llvm-svn: 245563	2015-08-20 13:45:02 +00:00
Tobias Grosser	862b9b5239	Factor out check for tileable band node. llvm-svn: 245559	2015-08-20 12:32:45 +00:00
Tobias Grosser	9bdea573bd	Introduce tileBand function to simplify code llvm-svn: 245558	2015-08-20 12:22:37 +00:00
Tobias Grosser	d891b54132	Add some forgotten isl memory annotations llvm-svn: 245557	2015-08-20 12:16:23 +00:00
Tobias Grosser	07c1c2fcc9	Make prevectorization width configurable Polly uses 'prevectorization' to enable outer loop vectorization. When vectorizing an outer loop, we strip-mine <number-of-prevec-dims> loop iterations which are then interchanged to the innermost level such that LLVM's inner loop vectorizer (or Polly's simple vectorizer) can easily vectorize this loop. The number of loop iterations to strip-mine is now configurable with the option -polly-prevect-width=<number-of-prevec-dims>. This is mostly a debugging option. We should probably add a heuristic that derives the number of prevectorization dimensions from the target data and the data types used. llvm-svn: 245424	2015-08-19 08:46:11 +00:00
Tobias Grosser	161c9081e5	Do not use negative option name Instead of -polly-no-tiling, we use -polly-tiling=false to disable tiling. llvm-svn: 245423	2015-08-19 08:22:06 +00:00
Tobias Grosser	f10f4636ff	Simplify tiling code a bit We only need to allocate the tile size vector if we actually want to perform a tiling. llvm-svn: 245422	2015-08-19 08:03:37 +00:00
Tobias Grosser	77c0f5a3b7	Drop dead and disable code from IndependentBlocks Since Polly has now support for the code generation of scalar and PHI dependences this code was unused and is now dropped. llvm-svn: 245284	2015-08-18 09:30:28 +00:00
Tobias Grosser	c5bcf246d1	Fix Polly after SCEV port to new pass manager This fixes compilation after LLVM commit r245193. llvm-svn: 245211	2015-08-17 10:57:08 +00:00
Tobias Grosser	234a48270e	AST Generation Paper published in TOPLAS The July issue of TOPLAS contains a 50 page discussion of the AST generation techniques used in Polly. This discussion gives not only an in-depth description of how we (re)generate an imperative AST from our polyhedral based mathematical program description, but also gives interesting insights about: - Schedule trees: A tree-based mathematical program description that enables us to perform loop transformations on an abstract level, while issues like the generation of the correct loop structure and loop bounds will be taken care of by our AST generator. - Polyhedral unrolling: We discuss techniques that allow the unrolling of non-trivial loops in the context of parametric loop bounds, complex tile shapes and conditionally executed statements. Such unrolling support enables the generation of predicated code e.g. in the context of GPGPU computing. - Isolation for full/partial tile separation: We discuss native support for handling full/partial tile separation and -- in general -- native support for isolation of boundary cases to enable smooth code generation for core computations. - AST generation with modulo constraints: We discuss how modulo mappings are lowered to efficient C/LLVM code. - User-defined constraint sets for run-time checks: We discuss how arbitrary sets of constraints can be used to automatically create run-time checks that ensure a set of constraints actually holds. This feature is very useful to verify at run-time various assumptions that have been taken during program optimization. Polyhedral AST generation is more than scanning polyhedra Tobias Grosser, Sven Verdoolaege, Albert Cohen ACM Transactions on Programming Languages and Systems (TOPLAS), 37(4), July 2015 llvm-svn: 245157	2015-08-15 09:34:33 +00:00
Michael Kruse	1d3c9b54fb	Remove leftover comment The function to which this commit applies has been removed in a previous commit. llvm-svn: 244450	2015-08-10 15:07:16 +00:00
Michael Kruse	fd613545cb	[Polly] Remove dead code in IndependentBlocks Summary: The splitExitBlock function is never called. Going to replace its functionality in successive patches that do not modify the IR. Reviewers: grosser Subscribers: pollydev Projects: #polly Differential Revision: http://reviews.llvm.org/D11865 llvm-svn: 244404	2015-08-08 20:31:20 +00:00
Tobias Grosser	b241d928bd	Rewrite getPrevectorMap using schedule trees operations Schedule trees are a lot easier to work with, for both humans and machines. For humans the more structured schedule representation is easier to reason about. Together with the more abstract isl programming interface this can result in a lot cleaner code (see this changeset). For machines, the structured schedule and the fact that we now use explicit piecewise affine expressions instead of integer maps makes it easier to generate code from this schedule tree. As a result, we can already see a slight compile-time improvement -- for 3mm from 0m0.593s to 0m0.551s seconds (-7 %). More importantly, future optimizations such as full-partial tile separation will most likely result in more streamlined code to be generated. Contributed-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 243458	2015-07-28 18:03:36 +00:00
Tobias Grosser	2764794ba4	Simplify some isl expression we use Suggested-by: Sven Verdoolaege <skimo-polly@kotnet.org> llvm-svn: 243254	2015-07-26 19:22:35 +00:00
Tobias Grosser	3b10c94062	Prevectorize the schedule of the band (or the point loop in case of tiling) Contributed-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 243214	2015-07-25 12:28:56 +00:00
Tobias Grosser	808cd69a92	Use schedule trees to represent execution order of statements Instead of flat schedules, we now use so-called schedule trees to represent the execution order of the statements in a SCoP. Schedule trees make it a lot easier to analyze, understand and modify properties of a schedule, as specific nodes in the tree can be chosen and possibly replaced. This patch does not yet fully move our DependenceInfo pass to schedule trees, as some additional performance analysis is needed here. (In general schedule trees should be faster in compile-time, as the more structured representation is generally easier to analyze and work with). We also can not yet perform the reduction analysis on schedule trees. For more information regarding schedule trees, please see Section 6 of https://lirias.kuleuven.be/handle/123456789/497238 llvm-svn: 242130	2015-07-14 09:33:13 +00:00
Tobias Grosser	af4e809ca6	Remove code for scalar and PHI to array translation This removes old code that has been disabled since several weeks and was hidden behind the flags -disable-polly-intra-scop-scalar-to-array=false and -polly-model-phi-nodes=false. Earlier, Polly used to translate scalars and PHI nodes to single element arrays, as this avoided the need for their special handling in Polly. With Johannes' patches adding native support for such scalar references to Polly, this code is not needed any more. After this commit both -polly-prepare and -polly-independent are now mostly no-ops. Only a couple of simple transformations still remain, but they are scheduled for removal too. Thanks again to Johannes Doerfert for his nice work in making all this code obsolete. llvm-svn: 240766	2015-06-26 07:31:18 +00:00
Michael Kruse	c59f22c556	Update ISL to isl-0.15-3-g532568a This version adds small integer optimization, but is not active by default. It will be enabled in a later commit. The schedule-fuse=min/max option has been replaced by the serialize-sccs option. Adapting Polly was necessary, but retaining the name polly-opt-fusion=min/max. Differential Revision: http://reviews.llvm.org/D10505 Reviewers: grosser llvm-svn: 240027	2015-06-18 16:45:40 +00:00
Tobias Grosser	97d8745087	Dump YAML schedule tree as properly indented tree in DEBUG output llvm-svn: 238645	2015-05-30 06:46:59 +00:00
Tobias Grosser	3e77d14563	Add indvar pass to canonicalization sequence Running indvar before Polly is useful as this eliminates zexts as they commonly appear when a 32 bit induction variable (type int) was used on a 64 bit system. These zexts confuse our delinearization and prevent for example the successful delinearization of the nussinov kernel in polybench-c-4.1. This fixes http://llvm.org/PR23426 Suggested-by: Xing Su <xsu.llvm@outlook.com> llvm-svn: 238643	2015-05-30 06:16:41 +00:00
Tobias Grosser	b2f399264d	Update isl to 93b8e43d This update brings mostly interface cleanups, but also fixes two bugs in imath (a memory leak, some undefined behavior). llvm-svn: 238422	2015-05-28 13:32:11 +00:00
Tobias Grosser	7c3bad52dd	Use value semantics for list of ScopStmt(s) instead of std::owningptr David Blaikie suggested this as an alternative to the use of owningptr(s) for our memory management, as value semantics allow to avoid the additional interface complexity caused by owningptr while still providing similar memory consistency guarantees. We could also have used a std::vector, but the use of std::vector would yield possibly changing pointers which currently causes problems as for example the memory accesses carry pointers to their parent statements. Such pointers should not change. Reviewer: jblaikie, jdoerfert Differential Revision: http://reviews.llvm.org/D10041 llvm-svn: 238290	2015-05-27 05:16:57 +00:00
Tobias Grosser	679dfafd33	Use unique_ptr to clarify ownership of ScopStmt llvm-svn: 238090	2015-05-23 05:14:09 +00:00
Tobias Grosser	ac60f4594f	Enable scalar and PHI code generation for Polly The feature itself has been committed by Johannes in r238070. As this is the way forward, we now enable it to ensure we get test coverage. Thank you Johannes for this nice work! llvm-svn: 238088	2015-05-23 03:34:41 +00:00
Tobias Grosser	1b6ea573f2	Replace low-level constraint building with higher level functions Instead of explicitly building constraints and adding them to our maps we now use functions like map_order_le to add the relevant information to the maps. llvm-svn: 237934	2015-05-21 19:02:44 +00:00
Tobias Grosser	cd524dc51d	Add explicit #includes for used isl features llvm-svn: 236931	2015-05-09 09:36:38 +00:00
Tobias Grosser	ba0d09227c	Sort include directives Upcoming revisions of isl require us to include header files explicitly, which have previously been already transitively included. Before we add them, we sort the existing includes. Thanks to Chandler for sort_includes.py. A simple, but very convenient script. llvm-svn: 236930	2015-05-09 09:13:42 +00:00
Tobias Grosser	5483931117	Rename 'scattering' to 'schedule' In Polly we used both the term 'scattering' and the term 'schedule' to describe the execution order of a statement without actually distinguishing between them. We now uniformly use the term 'schedule' for the execution order. This corresponds to the terminology of isl. History: CLooG introduced the term scattering as the generated code can be used as a sequential execution order (schedule) or as a parallel dimension enumerating different threads of execution (placement). In Polly and/or isl the term placement was never used, but we uniformly refer to an execution order as a schedule and only later introduce parallelism. When doing so we do not talk about specific placement dimensions. llvm-svn: 235380	2015-04-21 11:37:25 +00:00
Tobias Grosser	02cf69a6ed	Make -polly-no-tiling work again llvm-svn: 234125	2015-04-05 21:52:21 +00:00
Tobias Grosser	4f6bceface	Do not scale tile loops We now generate tile loops as: for (int c1 = 0; c1 <= 47; c1 += 1) for (int c2 = 0; c2 <= 47; c2 += 1) for (int c3 = 0; c3 <= 31; c3 += 1) for (int c4 = 0; c4 <= 31; c4 += 4) #pragma simd for (int c5 = c4; c5 <= c4 + 3; c5 += 1) Stmt_for_body3(32 * c1 + c3, 32 * c2 + c5); instead of for (int c1 = 0; c1 <= 1535; c1 += 32) for (int c2 = 0; c2 <= 1535; c2 += 32) for (int c3 = 0; c3 <= 31; c3 += 1) for (int c4 = 0; c4 <= 31; c4 += 4) #pragma simd for (int c5 = c4; c5 <= c4 + 3; c5 += 1) Stmt_for_body3(c1 + c3, c2 + c5); Run-time performance-wise this makes little difference, but this gives a large reduction in compile time (10-30% on 17 LNT benchmarks). Apparently the isl AST generator is not yet very efficient in generating the latter. llvm-svn: 233675	2015-03-31 07:52:36 +00:00
Tobias Grosser	378e003748	Drop libpluto support We do not have buildbots or anything that tests this functionality, hence it most likely bitrots. People interested to use this functionality can always recover it from svn history. llvm-svn: 233570	2015-03-30 17:54:01 +00:00
Tobias Grosser	bbb4cec2e8	Use schedule trees to perform post-scheduling transformations Replacing the old band_tree based code with code that is based on the new schedule tree [1] interface makes applying complex schedule transformations a lot more straightforward. We now do not need to reason about the meaning of flat schedules, but can use a more straightforward tree structure. We do not yet exploit this a lot in the current code, but hopefully we will be able to do so soon. This change also allows us to drop some code, as isl now provides some higher level interfaces to apply loop transformations such as tiling. This change causes some small test case changes as isl uses a slightly different way to perform loop tiling, but no significant functional changes are intended. [1] http://impact.gforge.inria.fr/impact2014/papers/impact2014-verdoolaege.pdf llvm-svn: 232911	2015-03-22 12:06:39 +00:00
Tobias Grosser	442c6ccb8c	Add some missing __isl_give/__isl_keep annotations llvm-svn: 232711	2015-03-19 07:43:35 +00:00
Johannes Doerfert	7e6424ba5a	Create a dependence struct to hold dependence information for a SCoP. The new Dependences struct in the DependenceInfo holds all information that was formerly part of the DependenceInfo. It also provides the same interface for the user to access this information. This is another step to a more general ScopPass interface that does allow multiple SCoPs to be "in flight". llvm-svn: 231327	2015-03-05 00:43:48 +00:00
Johannes Doerfert	f6557f98a2	Rename the Dependences pass to DependenceInfo [NFC] We rename the Dependences pass to DependenceInfo as a first step to a caching pass policy. The new DependenceInfo pass will later provide "Dependences" for a SCoP. To keep consistency the test folder is renamed too. llvm-svn: 231308	2015-03-04 22:43:40 +00:00
Johannes Doerfert	909a3bf21d	[Refactor] Use virtual and override appropriately + Add override for overwritten methods. + Remove virtual for methods we do not want to be overwritten. llvm-svn: 230898	2015-03-01 18:42:08 +00:00
Johannes Doerfert	3fe584d64f	[Refactor] Add a Scop & as argument to printScop This is the first step in the interface simplification. llvm-svn: 230897	2015-03-01 18:40:25 +00:00
Johannes Doerfert	5079200510	Do some preparation even with scalar and phi modeling enabled llvm-svn: 230790	2015-02-27 20:38:51 +00:00
Tobias Grosser	c3fe35df4c	Fix formatting llvm-svn: 229360	2015-02-16 06:40:23 +00:00
David Blaikie	c4d7bc3fcc	Update Polly for the removal of LLVM_DELETED_FUNCTION now that '= delete' works on all supported compilers (MSVC2012 compat has been dropped) llvm-svn: 229344	2015-02-15 23:40:18 +00:00
Johannes Doerfert	6f7921f2be	Do not try to optimize empty SCoPs. llvm-svn: 229253	2015-02-14 12:02:24 +00:00
Chandler Carruth	d01918fa13	[PM] Convert Polly over to directly use the legacy pass manager namespace and header rather than the top-level header and using declarations. These helpers impede modular builds and are going away. Migrating away from them will also be necessary to start mixing in any usage of the new pass manager. llvm-svn: 229091	2015-02-13 09:51:50 +00:00
Johannes Doerfert	7ceb040213	Add early exits for SCoPs we did not optimize This allows us to skip ast and code generation if we did not optimize a SCoP and will not generate parallel or alias annotations. The initial heuristic to exit is simple but allows improvements later on. All failing test cases have been modified to disable early exit, thus to keep their coverage. Differential Revision: http://reviews.llvm.org/D7254 llvm-svn: 228851	2015-02-11 17:25:09 +00:00
Johannes Doerfert	4a60b173a7	Do not run independent blocks when we model all scalar dependences llvm-svn: 228441	2015-02-06 21:26:45 +00:00
Johannes Doerfert	0ff23ec544	Model PHI nodes without demoting them This allows us to model PHI nodes in the polyhedral description without demoting them. The modeling however will result in the same accesses as the demotion would have introduced. Differential Revision: http://reviews.llvm.org/D7415 llvm-svn: 228433	2015-02-06 20:13:15 +00:00
Johannes Doerfert	9e3a5db000	[FIX] Debug build + intrinsic handling The ignored intrinsics needed to be ignored in three other places as well. Tests and lnt pass now. llvm-svn: 227092	2015-01-26 15:55:54 +00:00
Johannes Doerfert	07e8a406d6	[FIX] Independent blocks with intrinsics handling Also an old option was removed from some new test cases llvm-svn: 227057	2015-01-25 19:09:49 +00:00
Tobias Grosser	7a08488ca6	Drop an unused parameter llvm-svn: 226739	2015-01-21 23:11:46 +00:00
Chandler Carruth	f557987b15	[PM] Update Polly following LLVM r226373 which refactors LoopInfo in preparation for the new pass manager. llvm-svn: 226374	2015-01-17 14:16:56 +00:00
Tobias Grosser	11e3873516	Dead code elimination: Update dependences after eliminating code Without updating dependences we may lose implicit transitive dependences for which all explicit dependences have gone through the statement iterations we have just eliminated. No test case. We should probably implement a -verify-dependences option. This fixes llvm.org/PR21227 llvm-svn: 224459	2014-12-17 21:13:55 +00:00
Johannes Doerfert	305fed96e6	Drop Cloog support This commit drops the Cloog support for Polly. The scripts and documentation are changed to only use isl as prerequisite. In the code all Cloog specific parts have been removed and all relevant tests have been ported to the isl backend when it was created. llvm-svn: 223141	2014-12-02 19:26:58 +00:00
Tobias Grosser	71badac9d6	Remove Polly's IndVarSimplify pass Polly had a copy of this pass to create the canonical induction variables necessary for the non-scev-based code generation. As we now always use SCEV based code generation, canonical induction variables are not needed any more. llvm-svn: 222979	2014-11-30 14:33:41 +00:00
Tobias Grosser	683b8e4462	Remove -polly-codegen-scev option and related code SCEV based code generation has been the default for two weeks after having been tested for a long time. We now drop support for the non-scev-based code generation. llvm-svn: 222978	2014-11-30 14:33:31 +00:00
Tobias Grosser	422b30a017	Use new Small(Ptr)Set API This fixes the recent build failures. llvm-svn: 222358	2014-11-19 14:32:32 +00:00
Tobias Grosser	2f8732e7c6	Independent blocks: SE->forget() scalars translated to arrays This prevents SCEVs to reference values not valid any more and as a consequence solves a bug where such values reintroduced during ast generation caused the independent blocks pass to fail validation. http://llvm.org/PR21204 llvm-svn: 222103	2014-11-16 20:33:58 +00:00
Tobias Grosser	01aea5809f	Use stringFromIslObj instead of isl_..._dump to print to dbgs() This makes sure we consistently use dbgs() when printing debug output. Previously, the code just mixed calls to isl_*_dump() with printing to dbgs() and was relying for both methods to interact in predictable ways (same output stream, no unexpected reordering of outputs). llvm-svn: 220443	2014-10-22 23:16:28 +00:00
Johannes Doerfert	495dd053ed	[Fix] isl usage errors in ScheduleOptimizer llvm-svn: 216084	2014-08-20 17:15:34 +00:00
Johannes Doerfert	9e7b17b0d4	Added arcanist linters and cleaned errors and warnings Arcanist (arc) will now always run linters before uploading any new commit to Phabricator. All errors/warnings (or their absence) will be shown in the web interface together with an explanation by the committer (arcanist will ask the committer if the build was not clean). The linters include: - clang-format - spelling check - permissions check (aka. chmod) - filename check - merge conflict marker check Note, that their scope is sometimes limited (see .arclint for details). This commit also fixes all errors and warnings these linters reported, namely: - spelling mistakes and typos - executable permissions for various text files Differential Revision: http://reviews.llvm.org/D4916 llvm-svn: 215871	2014-08-18 00:40:13 +00:00
Johannes Doerfert	5aa2194ea5	[Polly] Remove the PoCC and ScopLib support Remove the PoCC and ScopLib support from Polly as we do not have a user/maintainer for it. Differential Revision: http://reviews.llvm.org/D4871 llvm-svn: 215563	2014-08-13 17:49:16 +00:00
Tobias Grosser	5b5fd4e27c	No need to run -mem2reg twice llvm-svn: 214632	2014-08-02 13:37:25 +00:00
Matt Arsenault	8ca36815ee	Update for RegionInfo changes. Mostly related to missing includes and renaming of the pass to RegionInfoPass. llvm-svn: 213457	2014-07-19 18:40:17 +00:00
Tobias Grosser	c2920ff747	DeadCodeElimination: Fix liveout computation We move back to a simple approach where the liveout is the last must-write statement for a data-location plus all may-write statements. The previous approach did not work out. We would have to consider per-data-access dependences, instead of per-statement dependences to correct it. As this adds complexity and it seems we would not gain anything over the simpler approach that we implement in this commit, I moved us back to the old approach of computing the liveout, but enhanced it to also add may-write accesses. We also fix the test case and explain why we can not perform dead code elimination in this case. llvm-svn: 212925	2014-07-14 08:32:01 +00:00
Tobias Grosser	e8162928c8	Remove unnecessary isl annotations They were just left over from copy-pasting. Reported-by: Johannes Doerfert <jdoerfert@codeaurora.org> llvm-svn: 212800	2014-07-11 09:02:41 +00:00
Tobias Grosser	780ce0f8e3	DeadCodeElim: Compute correct liveout for non-affine accesses Thanks to Johannes Doerfert for narrowing down the bug. Reported-by: Chris Jenneisch <chrisj@codeaurora.org> llvm-svn: 212796	2014-07-11 07:12:10 +00:00
Tobias Grosser	483a90d1bd	clang-format polly to avoid buildbot noise llvm-svn: 212609	2014-07-09 10:50:10 +00:00
Tobias Grosser	083d3d3cb3	[C++11] Use more range based fors llvm-svn: 211981	2014-06-28 08:59:45 +00:00
Johannes Doerfert	f1906138b4	Model statement wise reduction dependences + Collect reduction dependences + Introduced TYPE_RED in Dependences.h which can be used to obtain the reduction dependences + Used TYPE_RED to prevent parallelization while we do not have a privatizing code generation + Relax the dependences for non-parallel code generation + Add privatization dependences to ensure correctness + 12 Test cases to check for reduction and privatization dependences llvm-svn: 211369	2014-06-20 16:37:11 +00:00
Johannes Doerfert	aeed39774d	Fix build See r210927 and r210847 llvm-svn: 211278	2014-06-19 16:19:32 +00:00
Johannes Doerfert	c3958b214c	Added option for n-dimensional rectangular tiling + CL-option --polly-tile-sizes=<int,...,int> The i'th value is used as a tile size for dimension i, if there is no i'th value, the value of --polly-default-tile-size is used + CL-option --polly-default-tile-size=int Used if no tile size is given for a dimension i + 3 Simple testcases llvm-svn: 209753	2014-05-28 17:21:02 +00:00
Saleem Abdulrasool	e653622b98	polly: update for LLVM API change SVN r209103 removed the OwningPtr variant of the MemoryBuffer APIs. Switch to the equivalent std::unique_ptr versions. This should clear up the build bots. llvm-svn: 209104	2014-05-19 03:55:49 +00:00
Chandler Carruth	95fef9446c	[Modules] Fix potential ODR violations by sinking the DEBUG_TYPE definition below all of the header #include lines, Polly edition. If you want to know more details about this, you can see the recent commits to Debug.h in LLVM. This is just the Polly segment of a cleanup I'm doing globally for this macro. llvm-svn: 206852	2014-04-22 03:30:19 +00:00
Tobias Grosser	5a56cbf496	[C++11] Use nullptr llvm-svn: 206361	2014-04-16 07:33:47 +00:00
Tobias Grosser	e6c9c85bc8	Fix formatting llvm-svn: 206333	2014-04-15 22:30:10 +00:00
Tobias Grosser	c787b12d04	Avoid -Wunused-const-variable warning llvm-svn: 206329	2014-04-15 22:18:37 +00:00
Chandler Carruth	1fc97224af	Fix more build errors in Polly after r206310. David caught one of these in r206312, but others don't seem to show up on build bots? Unsure of why, they showed up for me. llvm-svn: 206326	2014-04-15 21:48:34 +00:00
Tobias Grosser	20532b8e1b	Fixed gcc build warnings + vim 'fixed' line endings in json_value.cpp Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de> llvm-svn: 206044	2014-04-11 17:56:49 +00:00
Tobias Grosser	64b95123ef	Delete trivial PHI nodes (aka stack slot sharing) During code preparation trivial PHI nodes (mainly introduced by lcssa) are deleted to decrease the number of introduced allocas (==> dependences). However simply replacing them by their only incoming value would cause the independent block pass to introduce new allocas. To prevent this we try to share stack slots during code preparation, hence to reuse an already created alloca 'to demote' the trivial PHI node. This works if we know that the value stored in this alloca will be the incoming value of the trivial PHI at the end of the predecessor block of this trivial PHI. Contributed-by: Johannes Doerfert <doerfert@cs.uni-saarland.de> llvm-svn: 205320	2014-04-01 16:01:33 +00:00
Tobias Grosser	2f4529f864	clang-format: Remove empty lines llvm-svn: 204468	2014-03-21 14:04:25 +00:00
Andreas Simbuerger	84e0723af8	(Make) Remove unused Makefiles llvm-svn: 203957	2014-03-14 18:25:31 +00:00
Tobias Grosser	09f459719e	[libpluto] Make more pluto options accessible Contributed-by: Sam Novak <snovak@uwsp.edu> llvm-svn: 203871	2014-03-13 23:37:48 +00:00
Tobias Grosser	64e8e37dee	Allow several polly command line options to be provided multiple times Contributed-by: Sam Novak <snovak@uwsp.edu> llvm-svn: 203869	2014-03-13 23:37:43 +00:00
Tobias Grosser	1f1c916074	[autoconf] Add Transform/ directory Contributed-by: Sam Novak <snovak@uwsp.edu> llvm-svn: 203868	2014-03-13 23:37:37 +00:00
Andreas Simbuerger	19523ed2be	Move transformations into own directory Move all transformations into their own directory. CMakeLists are adjusted accordingly. llvm-svn: 203607	2014-03-11 21:25:59 +00:00

409 Commits