llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	4b5f6af2dc	[cmake] Move isl_test artifacts to Polly folder. Folders in Visual Studio solutions help organize the build artifacts from all LLVM projects. There is a folder to keep Polly-built files in. llvm-svn: 283546	2016-10-07 12:38:24 +00:00
Tobias Grosser	e84ee850d1	Build and run isl_test as part of check-polly Running isl tests is important to gain confidence that the isl build we created works as expected. Besides the actual isl tests, there are also isl AST generation tests shipped with isl. This change only adds support for the isl unit tests. AST generation test support is left for a later commit. There is a choice to run tests directly through the build system or in the context of lit. We choose to run tests as part of lit to as this allows us to easily set environment variables, print output only on error and generally run the tests directly from the lit command. Reviewers: brad.king, Meinersbur Subscribers: modocache, brad.king, pollydev, beanz, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25155 llvm-svn: 283245	2016-10-04 19:48:40 +00:00
Michael Kruse	6ab4476835	[ScopInfo] Add -polly-unprofitable-scalar-accs option. With this option one can disable the heuristic that assumes that statements with a scalar write access cannot be profitably optimized. Such a statement instances necessarily have WAW-dependences to itself. With DeLICM scalar accesses can be changed to array accesses, which can avoid these WAW-dependence. llvm-svn: 283233	2016-10-04 17:33:39 +00:00
Michael Kruse	ca7cbcca37	[ScopInfo] Scalar access do not have indirect base pointers. ScopArrayInfo used to determine base pointer origins by looking up whether the base pointer is a load. The "base pointer" for scalar accesses is the llvm::Value being accessed. This is only a symbolic base pointer, it represents the alloca variable (.s2a or .phiops) generated for it at code generation. This patch disables determining base pointer origin for scalars. A test case where this caused a crash will be added in the next commit. In that test SAI tried to get the origin base pointer that was only declared later, therefore not existing. This is probably only possible for scalars used in PHINode incoming blocks. llvm-svn: 283232	2016-10-04 17:33:34 +00:00
Tobias Grosser	349d1c3368	[ScopDetection] Remove redundant checks for endless loops Summary: Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks to reject endless loops which are now removed and replaced by a single check in `isValidLoop()`. For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is renamed to `ReportLoopHasNoExit`. The test case `ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well. The schedule generation in `buildSchedule()` is based on the following assumption: Given some block B that is contained in a loop L and a SESE region R, we assume that L is contained in R or the other way around. However, this assumption is broken in the presence of endless loops that are nested inside other loops. Therefore, in order to prevent erroneous behavior in `buildSchedule()`, r265280 introduced a corresponding check in `canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding another check to reject endless loops in `allowOverApproximatedRegion()` in r273905. Hence there existed two separate locations that handled this case. Thank you Johannes Doerfert for helping to provide the above background information. Reviewers: Meinersbur, grosser Subscribers: _jdoerfert, pollydev Differential Revision: https://reviews.llvm.org/D24560 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 281987	2016-09-20 17:05:22 +00:00
Tobias Grosser	122d6d74f6	Fix spelling in CMakeLists llvm-svn: 281897	2016-09-19 10:55:31 +00:00
Tobias Grosser	05ee64e67a	GPGPU: add missing REQUIRES line to test case llvm-svn: 281850	2016-09-18 08:57:38 +00:00
Tobias Grosser	bc653f2031	GPGPU: Do not run mostly sequential kernels in GPU In case sequential kernels are found deeper in the loop tree than any parallel kernel, the overall scop is probably mostly sequential. Hence, run it on the CPU. llvm-svn: 281849	2016-09-18 08:31:09 +00:00
Tobias Grosser	82f2af3508	GPGPU: Dynamically ensure 'sufficient compute' Offloading to a GPU is only beneficial if there is a sufficient amount of compute that can be accelerated. Many kernels just have a very small number of dynamic compute, which means GPU acceleration is not beneficial. We compute at run-time an approximation of how many dynamic instructions will be executed and fall back to CPU code in case this number is not sufficiently large. To keep the run-time checking code simple, we over-approximate the number of instructions executed in each statement by computing the volume of the rectangular hull of its iteration space. llvm-svn: 281848	2016-09-18 06:50:35 +00:00
Tobias Grosser	cfdee6582b	GPGPU: Make test cases independent of register numbering [NFC] llvm-svn: 281847	2016-09-18 06:50:28 +00:00
Tobias Grosser	51dfc27589	GPGPU: Store back non-read-only scalars We may generate GPU kernels that store into scalars in case we run some sequential code on the GPU because the remaining data is expected to already be on the GPU. For these kernels it is important to not keep the scalar values in thread-local registers, but to store them back to the corresponding device memory objects that backs them up. We currently only store scalars back at the end of a kernel. This is only correct if precisely one thread is executed. In case more than one thread may be run, we currently invalidate the scop. To support such cases correctly, we would need to always load and store back from a corresponding global memory slot instead of a thread-local alloca slot. llvm-svn: 281838	2016-09-17 19:22:31 +00:00
Tobias Grosser	fe74a7a1f5	GPGPU: Detect read-only scalar arrays ... and pass these by value rather than by reference. llvm-svn: 281837	2016-09-17 19:22:18 +00:00
Tobias Grosser	aaabbbf886	GPGPU: Do not assume arrays start at 0 Our alias checks precisely check that the minimal and maximal accessed elements do not overlap in a kernel. Hence, we must ensure that our host <-> device transfers do not touch additional memory locations that are not covered in the alias check. To ensure this, we make sure that the data we copy for a given array is only the data from the smallest element accessed to the largest element accessed. We also adjust the size of the array according to the offset at which the array is actually accessed. An interesting result of this is: In case array are accessed with negative subscripts ,e.g., A[-100], we automatically allocate and transfer _more_ data to cover the full array. This is important as such code indeed exists in the wild. llvm-svn: 281611	2016-09-15 14:05:58 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Tobias Grosser	a82c4b5df8	GPGPU: Allow region statements llvm-svn: 281305	2016-09-13 08:42:10 +00:00
Tobias Grosser	b79f4d3970	GPGPU: Extend types when array sizes have smaller types This prevents a compiler crash. llvm-svn: 281303	2016-09-13 08:02:14 +00:00
Tobias Grosser	b51d507c74	Adapt test case to recent change in Global Variable Definition llvm-svn: 281295	2016-09-13 05:19:26 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	5857b701a3	GPGPU: Bail out gracefully in case of invalid IR Instead of aborting, we now bail out gracefully in case the kernel IR we generate is invalid. This can currently happen in case the SCoP stores pointer values, which we model as arrays, as data values into other arrays. In this case, the original pointer value is not available on the device and can consequently not be stored. As detecting this ahead of time is not so easy, we detect these situations after the invalid IR has been generated and bail out. llvm-svn: 281193	2016-09-12 06:06:31 +00:00
Tobias Grosser	0bf4cc6499	Add missing 'REQUIRES' line llvm-svn: 281166	2016-09-11 13:42:42 +00:00
Tobias Grosser	02293ed755	GPGPU: Do not fail in case of arrays never accessed If these arrays have never been accessed we failed to derive an upper bound of the accesses and consequently a size for the outermost dimension. We now explicitly check for empty access sets and then just use zero as size for the outermost dimension. llvm-svn: 281165	2016-09-11 13:30:12 +00:00
Michael Kruse	7886bd7ca5	Add -polly-flatten-schedule pass. The -polly-flatten-schedule pass reduces the number of scattering dimensions in its isl_union_map form to make them easier to understand. It is not meant to be used in production, only for debugging and regression tests. To illustrate, how it can make sets simpler, here is a lifetime set used computed by the porposed DeLICM pass without flattening: { Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0; Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5; Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0; Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3; Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 } And here the same lifetime for a semantically identical one-dimensional schedule: { Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 } Differential Revision: https://reviews.llvm.org/D24310 llvm-svn: 280948	2016-09-08 15:02:36 +00:00
Michael Kruse	564579726a	Add check-polly-tests build target. The check-polly-tests target runs regression/unit tests but without checking formatting. This is useful to not having to reload a file in an open editor (which eg. clears the undo buffer, moves cursor/window position) when running polly-update-format. After this change, the following test targets exist: - check-polly-unittests to run unittests only - check-polly-tests to run unit and regression tests - polly-check-format to check formatting using clang-format - check-polly to run them all As a side-effect, when running check-polly, polly-check-format and run in parallel (instead of polly-check-format first). Differential Revision: https://reviews.llvm.org/D24191 llvm-svn: 280654	2016-09-05 10:54:16 +00:00
Michael Kruse	2fa3519463	Allow mapping scalar MemoryAccesses to array elements. Change the code around setNewAccessRelation to allow to use a an existing array element for memory instead of an ad-hoc alloca. This facility will be used for DeLICM/DeGVN to convert scalar dependencies into regular ones. The changes necessary include: - Make the code generator use the implicit locations instead of the alloca ones. - A test case - Make the JScop importer accept changes of scalar accesses for that test case. - Adapt the MemoryAccess interface to the fact that the MemoryKind can change. They are named (get\|is)OriginalXXX() to get the status of the memory access before any change by setNewAccessRelation() (some properties such as getIncoming() do not change even if the kind is changed and are still required). To get the modified properties, there is (get\|is)LatestXXX(). The old accessors without Original\|Latest become synonyms of the (get\|is)OriginalXXX() to not make functional changes in unrelated code. Differential Revision: https://reviews.llvm.org/D23962 llvm-svn: 280408	2016-09-01 19:53:31 +00:00
Michael Kruse	d262feff80	Add space between access string and follow-up. llvm-svn: 279826	2016-08-26 15:43:52 +00:00
Michael Kruse	6b6e38d9b1	Add "New access function" to update_check.py classifier. Lines with this prefix are printed by JSONImporter. llvm-svn: 279825	2016-08-26 15:43:43 +00:00
Roman Gareev	44aeef7ecf	[FIX] Access dimensions should correspond to number of dimensions of the accesses array. llvm-svn: 279821	2016-08-26 13:41:53 +00:00
Michael Kruse	05cf9c22f1	Introduce unittests. Add the infrastructure for unittests to Polly and two simple tests for conversion between isl_val and APInt. In addition, a build target check-polly-unittests is added to run only the unittests but not the regression tests. Clang's unittest mechanism served as as a blueprint which then was adapted to Polly. Differential Revision: https://reviews.llvm.org/D23833 llvm-svn: 279734	2016-08-25 12:36:15 +00:00
Michael Kruse	0e63ab4243	Use configure_lit_site_cfg instead of configure_file. configure_lit_site_cfg defines some more parameters that are used in lit.site.cfg.in. configure_file would leave those empty. These additional definitions seem to be unimportant for regression tests, but unittests do not work without them. In case of out-of-tree builds, define the additional parameters with default values. These may not take all configuration parameters into account, as configure_lit_site_cfg would. llvm-svn: 279733	2016-08-25 12:03:33 +00:00
Michael Kruse	4a080de057	Add %loadPolly to test command line. Required for out-of-tree builds of Polly. llvm-svn: 279657	2016-08-24 19:12:48 +00:00
Roman Gareev	5f99f8656e	Add a flag to dump SCoP optimized with the IslScheduleOptimizer pass Dump polyhedral descriptions of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations applied on the schedule tree to be able to check the work of the IslScheduleOptimizer pass at the polyhedral level. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23740 llvm-svn: 279395	2016-08-21 11:20:39 +00:00
Eli Friedman	28671c83d6	[SCEVValidator] Don't reorder multiplies in extractConstantFactor. The existing code would add the operands in the wrong order, and eventually crash because the SCEV expression doesn't exactly match the parameter SCEV expression in SCEVAffinator::visit. (SCEV doesn't sort the operands to getMulExpr in general.) Differential Revision: https://reviews.llvm.org/D23592 llvm-svn: 279087	2016-08-18 16:30:42 +00:00
Tobias Grosser	1c18440958	[BlockGenerator] Invalidate SCEV values for instructions in scop We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that before dominated the later scop, but which had been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate used that are dominated by the values they use. This fixes: http://llvm.org/PR28984 llvm-svn: 279047	2016-08-18 10:45:57 +00:00
Tobias Grosser	b143e31164	[ScopInfo] Make scalars used by PHIs in non-affine regions available Normally this is ensured when adding PHI nodes, but as PHI node dependences do not need to be added in case all incoming blocks are within the same non-affine region, this was missed. This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting was disabled. llvm-svn: 278792	2016-08-16 11:44:48 +00:00
Tobias Grosser	c80c15bd50	[ScopDetect] Do not assert in case of AddRecs with non-constant start expression llvm-svn: 278738	2016-08-15 20:59:30 +00:00
Tobias Grosser	13e55a32fd	[test] Force invariant load hoisting one last time Without invariant load hoisting an (unrelated) bug is exposed in this test case: http://llvm.org/PR28984 llvm-svn: 278680	2016-08-15 16:43:33 +00:00
Tobias Grosser	7cb809983d	[tests] Force invariant load hoisting for test cases that need it -- III llvm-svn: 278673	2016-08-15 15:56:24 +00:00
Tobias Grosser	ad61c170d5	[tests] Force invariant load hoisting for test cases that need it II llvm-svn: 278669	2016-08-15 13:58:16 +00:00
Tobias Grosser	75b9c7df4d	[test] Correct spelling in test case and explicitly enable invariant load hoisting for this test case. llvm-svn: 278668	2016-08-15 13:58:04 +00:00
Tobias Grosser	6e6264c142	[tests] Force invariant load hoisting for test cases that need it This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant code hoisting to work. llvm-svn: 278667	2016-08-15 13:27:49 +00:00
Roman Gareev	1c892e91e3	Perform replacement of access relations and creation of new arrays according to the packing transformation This is the third patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform replacement of the access relations and create empty arrays, which are steps to implement the packing transformation. In subsequent changes we will implement copying to created arrays. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D22187 llvm-svn: 278666	2016-08-15 12:22:54 +00:00
Tobias Grosser	d58acf866a	[GPGPU] Ensure arrays where only parts are modified are copied to GPU To do so we change the way array exents are computed. Instead of the precise set of memory locations accessed, we now compute the extent as the range between minimal and maximal address in the first dimension and the full extent defined by the sizes of the inner array dimensions. We also move the computation of the may_persist region after the construction of the arrays, as it relies on array information. Without arrays being constructed no useful information is computed at all. llvm-svn: 278212	2016-08-10 10:58:19 +00:00
Tobias Grosser	b06ff4574e	[GPGPU] Support PHI nodes used in GPU kernel Ensure the right scalar allocations are used as the host location of data transfers. For the device code, we clear the allocation cache before device code generation to be able to generate new device-specific allocation and we need to make sure to add back the old host allocations as soon as the device code generation is finished. llvm-svn: 278126	2016-08-09 15:35:06 +00:00
Tobias Grosser	750160e260	[GPGPU] Use separate basic block for GPU initialization code This increases the readability of the IR and also clarifies that the GPU inititialization is executed _after_ the scalar initialization which needs to before the code of the transformed scop is executed. Besides increased readability, the IR should not change. Specifically, I do not expect any changes in program semantics due to this patch. llvm-svn: 278125	2016-08-09 15:35:03 +00:00
Tobias Grosser	776700d0b7	[BlockGenerator] Insert initializations at beginning of start block In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitalized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block. Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC. llvm-svn: 278124	2016-08-09 15:34:59 +00:00
Tobias Grosser	77f76788dc	[tests] Add two missing 'REQUIRES' lines llvm-svn: 278104	2016-08-09 09:11:39 +00:00
Tobias Grosser	c59b3ce044	[BlockGenerator] Also eliminate dead code not originating from BB After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination, if they have a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we now also considers code for dead code elimination, which does not have a corresponding instruction in BB. This fixes a bug in Polly-ACC where such dead-code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead-code are moved to the GPU. llvm-svn: 278103	2016-08-09 08:59:05 +00:00
Tobias Grosser	cf66ef26f3	[GPGPU] Pass parameters always by using their own type llvm-svn: 278100	2016-08-09 07:22:08 +00:00
Tobias Grosser	124534038a	[GPGPU] Support Values referenced from both isl expr and llvm instructions When adding code that avoids to pass values used in isl expressions and LLVM instructions twice, we forgot to make single variable passed to the kernel available in the ValueMap that makes it usable for instructions that are not replaced with isl ast expressions. This change adds the variable that is passed to the kernel to the ValueMap to ensure it is available for such use cases as well. llvm-svn: 278039	2016-08-08 19:22:19 +00:00
Tobias Grosser	cb1aef8de4	[GPGPU] Create code to verify run-time conditions llvm-svn: 278026	2016-08-08 17:35:55 +00:00

1 2 3 4 5 ...

955 Commits