llvm-project

Commit Graph

Author	SHA1	Message	Date
Michael Kruse	225d583825	Add warning for FORCE_STATIC libraries when using BUILD_SHARED_LIBS. We cannot build ISL as a shared object because we build it with -fvisibility=hidden; the created shared object would have no accessible symbols. The reason it is built with -fvisibility=hidden is that opt/clang might load other libraries that have ISL embedded and whose symbols would conflict with Polly's private ISL. This could happen with DragonEgg. In the future we might instead statically link PollyISL into the Polly shared object to avoid installing the static library. Requested-by: Vedran Miletic <vedran@miletic.net> See-also: llvm.org/PR27306 llvm-svn: 279737	2016-08-25 13:21:53 +00:00
Michael Kruse	05cf9c22f1	Introduce unittests. Add the infrastructure for unittests to Polly and two simple tests for conversion between isl_val and APInt. In addition, a build target check-polly-unittests is added to run only the unittests but not the regression tests. Clang's unittest mechanism served as a blueprint which then was adapted to Polly. Differential Revision: https://reviews.llvm.org/D23833 llvm-svn: 279734	2016-08-25 12:36:15 +00:00
Michael Kruse	0e63ab4243	Use configure_lit_site_cfg instead of configure_file. configure_lit_site_cfg defines some more parameters that are used in lit.site.cfg.in. configure_file would leave those empty. These additional definitions seem to be unimportant for regression tests, but unittests do not work without them. In case of out-of-tree builds, define the additional parameters with default values. These may not take all configuration parameters into account, as configure_lit_site_cfg would. llvm-svn: 279733	2016-08-25 12:03:33 +00:00
Michael Kruse	17a8b791ae	Add LLVM libdir to library search path in out-of-tree builds. This previously was not required because in an out-of-tree build Polly would only build libraries (LLVMPolly, libPolly, libPollyISL, libPollyPPCG), but no executables that the libraries would be linked into. This will change when adding unittests in a follow-up commit. llvm-svn: 279730	2016-08-25 11:28:52 +00:00
Michael Kruse	941a692690	Also warn if llvm-lit is not available. The program 'llvm-lit', like 'not' and 'FileCheck', is necessary for running check-polly. Warn if any of the three is not in the LLVM_INSTALL_ROOT/bin directory. llvm-svn: 279728	2016-08-25 10:35:22 +00:00
Michael Kruse	4a080de057	Add %loadPolly to test command line. Required for out-of-tree builds of Polly. llvm-svn: 279657	2016-08-24 19:12:48 +00:00
Tim Shen	12921aaa7b	Migrate from NodeType * to NodeRef. llvm-svn: 279488	2016-08-22 22:30:27 +00:00
Roman Gareev	5f99f8656e	Add a flag to dump SCoP optimized with the IslScheduleOptimizer pass Dump polyhedral descriptions of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations applied on the schedule tree to be able to check the work of the IslScheduleOptimizer pass at the polyhedral level. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23740 llvm-svn: 279395	2016-08-21 11:20:39 +00:00
Roman Gareev	e2ee79afde	Simplify AccFuncMap to vector<> AccessFunctions getAccessFunctions() is dead code and the 'BB' argument of getOrCreateAccessFunctions() is not used. This patch deletes getAccessFunctions and transforms AccFuncMap into a std::vector<std::unique_ptr<MemoryAccess>> AccessFunctions. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23759 llvm-svn: 279394	2016-08-21 11:09:19 +00:00
Eli Friedman	28671c83d6	[SCEVValidator] Don't reorder multiplies in extractConstantFactor. The existing code would add the operands in the wrong order, and eventually crash because the SCEV expression doesn't exactly match the parameter SCEV expression in SCEVAffinator::visit. (SCEV doesn't sort the operands to getMulExpr in general.) Differential Revision: https://reviews.llvm.org/D23592 llvm-svn: 279087	2016-08-18 16:30:42 +00:00
Tobias Grosser	1c18440958	[BlockGenerator] Invalidate SCEV values for instructions in scop We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that before dominated the later scop, but which had been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate uses that are not dominated by the values they use. This fixes: http://llvm.org/PR28984 llvm-svn: 279047	2016-08-18 10:45:57 +00:00
Michael Kruse	ffb3278e27	Update ISL to isl-0.17.1-200-gd8de4ea. This version fixes a bug in set coalescing. llvm-svn: 278936	2016-08-17 15:24:45 +00:00
Tobias Grosser	b143e31164	[ScopInfo] Make scalars used by PHIs in non-affine regions available Normally this is ensured when adding PHI nodes, but as PHI node dependences do not need to be added in case all incoming blocks are within the same non-affine region, this was missed. This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting was disabled. llvm-svn: 278792	2016-08-16 11:44:48 +00:00
Tobias Grosser	c80c15bd50	[ScopDetect] Do not assert in case of AddRecs with non-constant start expression llvm-svn: 278738	2016-08-15 20:59:30 +00:00
Tobias Grosser	74814e1a07	Disable invariant load hoisting temporarily With invariant load hoisting enabled the LLVM buildbots currently show some miscompiles, which are possibly caused by invariant load hoisting itself. Confirming and fixing this requires a more in-depth analysis. To get back green buildbots in the meantime, which allow us to observe other regressions, we disable invariant load hoisting temporarily. The relevant bug is tracked at: http://llvm.org/PR28985 llvm-svn: 278681	2016-08-15 16:43:36 +00:00
Tobias Grosser	13e55a32fd	[test] Force invariant load hoisting one last time Without invariant load hoisting an (unrelated) bug is exposed in this test case: http://llvm.org/PR28984 llvm-svn: 278680	2016-08-15 16:43:33 +00:00
Tobias Grosser	7cb809983d	[tests] Force invariant load hoisting for test cases that need it -- III llvm-svn: 278673	2016-08-15 15:56:24 +00:00
Tobias Grosser	ad61c170d5	[tests] Force invariant load hoisting for test cases that need it II llvm-svn: 278669	2016-08-15 13:58:16 +00:00
Tobias Grosser	75b9c7df4d	[test] Correct spelling in test case and explicitly enable invariant load hoisting for this test case. llvm-svn: 278668	2016-08-15 13:58:04 +00:00
Tobias Grosser	6e6264c142	[tests] Force invariant load hoisting for test cases that need it This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant code hoisting to work. llvm-svn: 278667	2016-08-15 13:27:49 +00:00
Roman Gareev	1c892e91e3	Perform replacement of access relations and creation of new arrays according to the packing transformation This is the third patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform replacement of the access relations and create empty arrays, which are steps to implement the packing transformation. In subsequent changes we will implement copying to created arrays. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D22187 llvm-svn: 278666	2016-08-15 12:22:54 +00:00
Tobias Grosser	d58acf866a	[GPGPU] Ensure arrays where only parts are modified are copied to GPU To do so we change the way array extents are computed. Instead of the precise set of memory locations accessed, we now compute the extent as the range between minimal and maximal address in the first dimension and the full extent defined by the sizes of the inner array dimensions. We also move the computation of the may_persist region after the construction of the arrays, as it relies on array information. Without arrays being constructed no useful information is computed at all. llvm-svn: 278212	2016-08-10 10:58:19 +00:00
Mandeep Singh Grang	8c5314479e	Fix spacing around variable initializations and for-loops. NFC. Reviewers: grosser, _jdoerfert, zinob Projects: #polly Differential Revision: https://reviews.llvm.org/D23285 llvm-svn: 278143	2016-08-09 17:49:24 +00:00
Tobias Grosser	b06ff4574e	[GPGPU] Support PHI nodes used in GPU kernel Ensure the right scalar allocations are used as the host location of data transfers. For the device code, we clear the allocation cache before device code generation to be able to generate new device-specific allocation and we need to make sure to add back the old host allocations as soon as the device code generation is finished. llvm-svn: 278126	2016-08-09 15:35:06 +00:00
Tobias Grosser	750160e260	[GPGPU] Use separate basic block for GPU initialization code This increases the readability of the IR and also clarifies that the GPU initialization is executed _after_ the scalar initialization, which needs to happen before the code of the transformed scop is executed. Besides increased readability, the IR should not change. Specifically, I do not expect any changes in program semantics due to this patch. llvm-svn: 278125	2016-08-09 15:35:03 +00:00
Tobias Grosser	776700d0b7	[BlockGenerator] Insert initializations at beginning of start block In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitialized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block. Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC. llvm-svn: 278124	2016-08-09 15:34:59 +00:00
Tobias Grosser	77f76788dc	[tests] Add two missing 'REQUIRES' lines llvm-svn: 278104	2016-08-09 09:11:39 +00:00
Tobias Grosser	c59b3ce044	[BlockGenerator] Also eliminate dead code not originating from BB After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination if they have a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we now also consider code for dead-code elimination that does not have a corresponding instruction in BB. This fixes a bug in Polly-ACC where such dead code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead code are moved to the GPU. llvm-svn: 278103	2016-08-09 08:59:05 +00:00
Tobias Grosser	cf66ef26f3	[GPGPU] Pass parameters always by using their own type llvm-svn: 278100	2016-08-09 07:22:08 +00:00
Michael Kruse	a6cc0d3a2d	[ScopDetection] Remove unused DetectionContexts during expansion. The function expandRegion() frees Region* objects again when it determines that these are not valid SCoPs. However, the DetectionContext added to the DetectionContextMap still holds a reference. The validity is checked using the ValidRegions lookup table. When a new Region is added to that list, it might share the same address, such that the DetectionContext contains two Region* associations that are in ValidRegions, but that are unrelated and of which one has already been freed. Also remove the DetectionContext when not a valid expansion. llvm-svn: 278062	2016-08-08 22:39:32 +00:00
Tobias Grosser	124534038a	[GPGPU] Support Values referenced from both isl expr and llvm instructions When adding code that avoids passing values used in both isl expressions and LLVM instructions twice, we forgot to make the single variable passed to the kernel available in the ValueMap that makes it usable for instructions that are not replaced with isl ast expressions. This change adds the variable that is passed to the kernel to the ValueMap to ensure it is available for such use cases as well. llvm-svn: 278039	2016-08-08 19:22:19 +00:00
Tobias Grosser	cb1aef8de4	[GPGPU] Create code to verify run-time conditions llvm-svn: 278026	2016-08-08 17:35:55 +00:00
Tobias Grosser	fa9abd1f03	Fix compilation in 'asserts' mode llvm-svn: 278025	2016-08-08 17:35:52 +00:00
Tobias Grosser	0aa29532b7	[IslNodeBuilder] Move run-time check generation to NodeBuilder [NFC] This improves the structure of the code and allows us to reuse the runtime code generation in the PPCGCodeGeneration. llvm-svn: 278017	2016-08-08 15:41:52 +00:00
Tobias Grosser	219feac456	[CodeGeneration] Do not set insert position redundantly There is no need to reset the position of the builder, as we can just continue to insert code at the current position of the IRBuilder, which happens to be precisely the location we reset the builder to. llvm-svn: 278014	2016-08-08 15:25:50 +00:00
Tobias Grosser	000db70754	[IslNodeBuilder] Directly use the insert location of our Builder ... instead of adding instructions at the end of the basic block the builder is currently at. This makes it easier to reason about where IR is generated, as with the IRBuilder there is just a single location that specifies where IR is generated. llvm-svn: 278013	2016-08-08 15:25:46 +00:00
Michael Kruse	fbde435517	[CodeGen] Use MapVector instead of DenseMap. The map is iterated over when generating the values escaping the SCoP. The nondeterministic iteration order of DenseMap causes the output IR to change at every compilation, adding noise to comparisons. Replace DenseMap by a MapVector to ensure the same iteration order at every compilation. llvm-svn: 277832	2016-08-05 16:45:51 +00:00
Michael Kruse	d82222fc1b	[DependenceInfo] Reset operations counter when setting limit. When entering the dependence computation and the max_operations is set, the operations counter may have already exceeded the limit, thus aborting any ISL computation from the start. The counter is reset at the end of the dependence calculation such that a follow-up recomputation might succeed, i.e. the success of the first dependence calculation depends on unrelated ISL operations that happened before, putting it at a disadvantage relative to the following calculations. This patch resets the operations counter at the beginning of the dependence recalculation so that it does not depend on previous actions. Otherwise, additional preprocessing of the Scop that aims to improve its schedulability (e.g. DeLICM) makes DependenceInfo, and hence the scheduling, more likely to fail, which is counterproductive to the goal of said preprocessing. llvm-svn: 277810	2016-08-05 11:31:02 +00:00
Tobias Grosser	928d7573dd	GPGPU: Sort dimension sizes of multi-dimensional shared memory arrays correctly Before this commit we generated the array type in reverse order and we also added the outermost dimension size to the new array declaration, which is incorrect as Polly assumes an additional unsized outermost dimension, such that we had an off-by-one error in the linearization of access expressions. llvm-svn: 277802	2016-08-05 08:27:24 +00:00
Tobias Grosser	470608e3e4	Add missing 'REQUIRES' line llvm-svn: 277800	2016-08-05 07:08:45 +00:00
Tobias Grosser	c1c6a2a61b	GPGPU: Add cuda annotations to specify maximal number of threads per block These annotations ensure that the NVIDIA PTX assembler limits the number of registers used such that we can be certain the resulting kernel can be executed for the number of threads in a thread block that we are planning to use. llvm-svn: 277799	2016-08-05 06:47:43 +00:00
Tobias Grosser	f919d8b360	GPGPU: Support scalars that are mapped to shared memory llvm-svn: 277726	2016-08-04 13:57:29 +00:00
Tobias Grosser	8950cead7f	GPGPU: Disable verbose debug output llvm-svn: 277724	2016-08-04 12:44:03 +00:00
Tobias Grosser	b0dd95bcd2	Remove leftover debug output llvm-svn: 277723	2016-08-04 12:41:28 +00:00
Tobias Grosser	130ca30f92	GPGPU: Add private memory support llvm-svn: 277722	2016-08-04 12:39:03 +00:00
Tobias Grosser	b513b4916b	GPGPU: Add support for shared memory llvm-svn: 277721	2016-08-04 12:18:14 +00:00
Tobias Grosser	b187515784	GPGPU: Cache PTX kernels We always keep a number of already compiled kernels available to avoid costly recompilation. llvm-svn: 277707	2016-08-04 09:15:58 +00:00
Tobias Grosser	00bb5a99f5	GPGPU: Handle scalar array references Pass the content of scalar array references to the alloca on the kernel side and do not additionally pass them as normal LLVM scalar values. llvm-svn: 277699	2016-08-04 06:55:59 +00:00
Tobias Grosser	3216f8546c	BlockGenerator: Assert that we do not get alloca of array access llvm-svn: 277698	2016-08-04 06:55:53 +00:00
Tobias Grosser	576932728d	GPGPU: Pass subtree values correctly to the kernel llvm-svn: 277697	2016-08-04 06:55:49 +00:00
Tobias Grosser	629109b633	GPGPU: Mark kernel functions as polly.skip Otherwise, we would try to re-optimize them with Polly-ACC and possibly even generate kernels that try to offload themselves, which does not work as the GPURuntime is not available on the accelerator and also does not make any sense. llvm-svn: 277589	2016-08-03 12:00:07 +00:00
Tobias Grosser	2219d15748	Fix a couple of spelling mistakes llvm-svn: 277569	2016-08-03 05:28:09 +00:00
Roman Gareev	0c09a3af00	Add missing prefixes. llvm-svn: 277264	2016-07-30 11:15:00 +00:00
Roman Gareev	d7754a1245	Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D22828 llvm-svn: 277263	2016-07-30 09:25:51 +00:00
Tobias Grosser	8af38ecaa3	Add missing REQUIRES line llvm-svn: 276964	2016-07-28 07:08:34 +00:00
Tobias Grosser	d8b94bcac1	GPGPU: Pass context parameters to GPU kernel llvm-svn: 276963	2016-07-28 06:47:59 +00:00
Tobias Grosser	a490147c90	GPGPU: Pass host iterators to kernel llvm-svn: 276962	2016-07-28 06:47:56 +00:00
Tobias Grosser	44143bb927	GPGPU: use current 'Index' to find slot in parameter array Before this change we used the array index, which would result in us accessing the parameter array out-of-bounds. This bug was visible for test cases where not all arrays in a scop are passed to a given kernel. llvm-svn: 276961	2016-07-28 06:47:53 +00:00
Tobias Grosser	4e18d71c71	GPGPU: Generate kernel parameter allocation with right size Before this change we miscounted the number of function parameters. llvm-svn: 276960	2016-07-28 06:47:50 +00:00
Tobias Grosser	79a947c233	GPGPU: Add basic support for kernel launches llvm-svn: 276863	2016-07-27 13:20:16 +00:00
Tobias Grosser	5779359624	GPGPU: Load GPU kernels We embed the PTX code into the host IR as a global variable and compile it at run-time into a GPU kernel. llvm-svn: 276645	2016-07-25 16:31:21 +00:00
Johannes Doerfert	8031238017	[GSoC] Add PolyhedralInfo pass - new interface to polly analysis Adding a new pass PolyhedralInfo. This pass will be the interface to Polly. Initially, we will provide the following interface: - #IsParallel(Loop *L) - return a bool depending on whether the loop is parallel or not for the given program order. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D21486 llvm-svn: 276637	2016-07-25 12:48:45 +00:00
Tobias Grosser	13c78e4d51	GPGPU: Emit data-transfer code Also factor out getArraySize() to avoid code duplication and reorder some function arguments to indicate the direction into which data is transferred. llvm-svn: 276636	2016-07-25 12:47:39 +00:00
Tobias Grosser	7287aeddf1	GPGPU: Complete code to allocate and free device arrays At the beginning of each SCoP, we allocate device arrays for all arrays used on the GPU and we free such arrays after the SCoP has been executed. llvm-svn: 276635	2016-07-25 12:47:33 +00:00
Tobias Grosser	19b8a0bbfb	GPURuntime: Add missing debug output llvm-svn: 276634	2016-07-25 12:47:28 +00:00
Tobias Grosser	9855e8bd80	GPURuntime: Fix typo in docu llvm-svn: 276633	2016-07-25 12:47:25 +00:00
Tobias Grosser	a71eedd4c5	GPURuntime: Drop polly_cleanupGPGPUResources This function is currently unused and won't be used in this form again. Instead of freeing many unrelated items at the same time, we will explicitly call a free function from the host-IR we generate for each object we want to free. These specific free functions will be added together with the corresponding host-IR generation code. llvm-svn: 276632	2016-07-25 12:47:22 +00:00
Johannes Doerfert	3b7ac0a691	[GSoC] Do not process SCoPs with infeasible runtime context Do not process SCoPs with infeasible runtime context in the new ScopInfoWrapperPass. Do not compute dependences for such SCoPs in the new DependenceInfoWrapperPass. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D22402 llvm-svn: 276631	2016-07-25 12:40:59 +00:00
Roman Gareev	3a18a931a8	Apply all necessary tilings and interchangings to get a macro-kernel This is the second patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS macro-kernel by applying a combination of tiling and interchanging. In subsequent changes we will implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21491 llvm-svn: 276627	2016-07-25 09:42:53 +00:00
Tobias Grosser	fa7b080218	GPGPU: initialize GPU context and simplify the corresponding GPURuntime interface. There is no need to expose the selected device at the moment. We also pass back pointers as return values, as this simplifies the interface. llvm-svn: 276623	2016-07-25 09:16:01 +00:00
Tobias Grosser	8ed5e5999f	IslNodeBuilder: Make finalize() virtual This allows the finalization routine of the IslNodeBuilder to be overridden by derived classes. Being here, we also drop the unnecessary 'Scop' postfix and the unnecessary 'Scop' parameter. llvm-svn: 276622	2016-07-25 09:15:57 +00:00
Tobias Grosser	0a1a2720c8	GPURuntime: Check for debug-mode early on Before this change, the debug statements in polly_initDevice would all be skipped, as debug-mode would only be enabled _after_ they have already been run. llvm-svn: 276621	2016-07-25 09:15:53 +00:00
Tobias Grosser	dc816da455	GPURuntime: Drop timing functionality (some leftover II) llvm-svn: 276617	2016-07-25 08:03:08 +00:00
Roman Gareev	2cb4d133f5	[NFC] Refactor creation of the BLIS micro-kernel and improve documentation Reviewed-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 276616	2016-07-25 07:27:59 +00:00
Tobias Grosser	97aa23519e	GPURuntime: Drop timing functionality (some leftover) llvm-svn: 276612	2016-07-25 07:11:49 +00:00
Tobias Grosser	92713bea42	GPURuntime: Drop timing functionality This functionality won't be used in the current iteration. Drop it for now to reduce the surface of the library. We can always add it back in when we need it again. llvm-svn: 276611	2016-07-25 07:10:45 +00:00
Tobias Grosser	9a18d55947	GPGPU: Optimize kernel IR before generating assembly code We optimize the kernel _after_ dumping the IR we generate to make the IR we dump easier to read and independent of possible changes in the general purpose LLVM optimizers. llvm-svn: 276551	2016-07-24 06:43:21 +00:00
Tobias Grosser	e1a98343a1	GPGPU: Verify kernel IR before generating assembly llvm-svn: 276550	2016-07-24 06:43:17 +00:00
Michael Kruse	977d38bd87	Remove unused parameters from simplifySCoP(). NFC. llvm-svn: 276444	2016-07-22 17:31:17 +00:00
Tobias Grosser	74dc3cb431	GPGPU: Generate PTX assembly code for the kernel modules Run the NVPTX backend over the GPUModule IR and write the resulting assembly code in a string. To work correctly, it is important to invalidate analysis results that still reference the IR in the kernel module. Hence, this change clears all references to dominators, loop info, and scalar evolution. Finally, the NVPTX backend has trouble generating code for various special floating point types (not surprising), but also for uncommon integer types. This commit does not resolve these issues, but pulls out problematic test cases into separate files to XFAIL them individually and resolve them in future (not immediate) changes one by one. llvm-svn: 276396	2016-07-22 07:11:12 +00:00
Tobias Grosser	edb885cb12	GPGPU: generate code for ScopStatements This change introduces the actual compute code in the GPU kernels. To ensure all values referenced from the statements in the GPU kernel are indeed available we scan all ScopStmts in the GPU kernel for references to llvm::Values that are not yet covered by already modeled outer loop iterators, parameters, or array base pointers and also pass these additional llvm::Values to the GPU kernel. For arrays used in the GPU kernel we introduce a new ScopArrayInfo object, which is referenced by the newly generated access functions within the GPU kernel and which is used to help with code generation. llvm-svn: 276270	2016-07-21 13:15:59 +00:00
Tobias Grosser	86083da0ec	IslNodeBuilder: expose addReferencesFromStmt [NFC] This will be used by Polly GPGPU to determine the values that need to be passed to GPU kernels. llvm-svn: 276269	2016-07-21 13:15:55 +00:00
Tobias Grosser	04b909fcca	IslExprBuilder: allow to specify an external isl_id to ScopArrayInfo mapping This is useful for external users using IslExprBuilder, in case they cannot embed ScopArrayInfo data into their isl_ids, because the isl_ids either already carry other information or the isl_ids have been created and their user pointers cannot be updated any more. llvm-svn: 276268	2016-07-21 13:15:51 +00:00
Tobias Grosser	9d12d8ade3	BlockGenerator: remove dead instructions in normal statements This ensures that no trivially dead code is generated. This is not only cleaner, but also avoids troubles in case code is generated in a separate function and some of this dead code contains references to values that are not available. This issue may happen, in case the memory access functions have been updated and old getelementptr instructions remain in the code. With normal Polly, a test case is difficult to draft, but the upcoming GPU code generation can possibly trigger such problems. We will later extend this dead-code elimination to region and vector statements. llvm-svn: 276263	2016-07-21 11:48:36 +00:00
Tobias Grosser	212469e0ed	tests: make test cases more robust using regexp llvm-svn: 276262	2016-07-21 11:48:31 +00:00
Tobias Grosser	903eefd1f2	tests: fix order of memory accesses to ensure import succeeds It seems the order in which we generated memory accesses changed such that the import of these updated memory accesses failed for the 'loop3' statement in this test case. Unfortunately, the existing CHECK lines were not strict enough to catch this. Hence, besides fixing the order of the memory access lines we also ensure that the memory access changes are both clearly visible and well checked. llvm-svn: 276247	2016-07-21 07:12:17 +00:00
Tobias Grosser	9ea152714a	JScop: Factor out importContext [NFC] This makes the structure of the code clearer and reduces the size of runOnScop. We also adjust the coding style to the latest LLVM style guide. llvm-svn: 276246	2016-07-21 06:56:33 +00:00
Tobias Grosser	dbe34f7c58	JScop: Factor out importContext [NFC] This makes the structure of the code clearer and reduces the size of runOnScop. We also adjust the coding style to the latest LLVM style guide. llvm-svn: 276245	2016-07-21 06:56:31 +00:00
Tobias Grosser	c602d3bc84	JScop: Factor out importSchedule [NFC] This makes the structure of the code clearer and reduces the size of runOnScop. We also adjust the coding style to the latest LLVM style guide. llvm-svn: 276244	2016-07-21 06:56:28 +00:00
Tobias Grosser	9ec4f95234	Update isl to isl-0.17.1-191-g540b2fd This update resolves a bug in computing lexicographic minima/maxima. llvm-svn: 276138	2016-07-20 16:53:07 +00:00
Tobias Grosser	f533571fd2	Update isl to isl-0.17.1-171-g233f589 This fixes an issue with equality detection that resulted in an assertion being triggered during coalescing. llvm-svn: 276094	2016-07-20 07:52:42 +00:00
Tobias Grosser	2d58a64e7f	GPGPU: Bail out of scops with hoisted invariant loads This is currently not supported and will only be added later. Also update the test cases to ensure no invariant code hoisting is applied. llvm-svn: 275987	2016-07-19 15:56:25 +00:00
Tobias Grosser	22117a8913	GPGPU: Disable invariant load hoisting for GPU code generation This simplifies the upcoming patches to add code generation for ScopStmts. Load hoisting support will later be added in a separate commit. This commit will be implicitly tested by the subsequent GPGPU changes. llvm-svn: 275969	2016-07-19 11:13:58 +00:00
Tobias Grosser	f95c5cd06a	test: Add missing 'REQUIRES' line llvm-svn: 275962	2016-07-19 07:47:27 +00:00
Tobias Grosser	92852dbe78	test: Add missing 'REQUIRES' line llvm-svn: 275960	2016-07-19 07:39:54 +00:00
Tobias Grosser	5260c041ea	GPGPU: Emit in-kernel synchronization statements We use this opportunity to further classify the different user statements that can arise and add TODOs for the ones not yet implemented. llvm-svn: 275957	2016-07-19 07:33:16 +00:00
Tobias Grosser	59ab070523	GPGPU: generate control flow within the kernel llvm-svn: 275956	2016-07-19 07:33:11 +00:00
Tobias Grosser	c84a1995fe	GPGPU: add scop parameters to kernel arguments llvm-svn: 275955	2016-07-19 07:33:06 +00:00
Tobias Grosser	f6044bd0ef	GPGPU: add host iterators to kernel arguments llvm-svn: 275954	2016-07-19 07:32:55 +00:00
Tobias Grosser	472f9654c8	GPGPU: add intrinsic functions to obtain a kernels thread and block ids llvm-svn: 275953	2016-07-19 07:32:44 +00:00
Tobias Grosser	32837fe313	GPGPU: create kernel function skeleton Create for each kernel a separate LLVM-IR module containing a single function marked as kernel function and taking one pointer for each array referenced by this kernel. Add debugging output to verify the kernels are generated correctly. llvm-svn: 275952	2016-07-19 07:32:38 +00:00
Tobias Grosser	b9fc860a57	GPGPU: collect array references Initialize the list of references to a GPU array to ensure that the arrays that need to be passed to kernel calls are computed correctly. Furthermore, the very same information is also necessary to compute synchronization correctly. As the functionality to compute these references is already available, what is left for us to do is only to connect the necessary functionality to compute array reference information. llvm-svn: 275798	2016-07-18 15:44:32 +00:00
Tobias Grosser	1fb9b64dc0	GPGPU: Pull implementation out of class definition This will allow us to see the full class definition even after we add non-trivial implementations of the different member functions. llvm-svn: 275797	2016-07-18 15:44:25 +00:00
Tobias Grosser	05aad8dbcd	test: Add missing 'REQUIRES' line llvm-svn: 275784	2016-07-18 12:02:44 +00:00
Tobias Grosser	38fc0aed08	GPGPU: Create host control flow Create LLVM-IR for all host-side control flow of a given GPU AST. We implement this by introducing a new GPUNodeBuilder class derived from IslNodeBuilder. The IslNodeBuilder will take care of generating all general-purpose ast nodes, but we provide our own createUser implementation to handle the different GPU specific user statements. For now, we just skip any user statement and only generate a host-code skeleton, but in subsequent commits we will add handling of normal ScopStmt's performing computations, kernel calls, as well as host-device data transfers. We will also introduce run-time check generation and LICM in subsequent commits. llvm-svn: 275783	2016-07-18 11:56:39 +00:00
Tobias Grosser	cda19c230c	GPGPU: Abort if any dummy function is called This ensures that accidental calls to these functions will break loudly instead of corrupting the stack with invalid return values. These functions have been introduced earlier as replacement of pet and parts of ppcg which we will never use and consequently have not been imported or compiled into Polly. llvm-svn: 275680	2016-07-16 07:30:27 +00:00
Tobias Grosser	2025173494	GPGPU: Format statements scheduled on the host ourselves Otherwise ppcg would try to call into pet functionality that is not available, which obviously will cause trouble. As we can easily print these statements ourselves, we just do so. llvm-svn: 275579	2016-07-15 17:12:41 +00:00
Tobias Grosser	2341fe9e76	GPGPU: Use schedule whole components for scheduler This option increases the scalability of the scheduler and allows us to remove the 'gisting' workaround we introduced in r275565 to handle a more complicated test case. Another benefit of using this option is also that the generated code looks a lot more streamlined. Thanks to Sven Verdoolaege for reminding me of this option. llvm-svn: 275573	2016-07-15 16:15:47 +00:00
Tobias Grosser	e4725437e8	GPGPU: Drop domain constraints from flow dependences This works around a shortcoming of the isl scheduler, which even for some smaller test cases does not terminate in case domain constraints are part of the flow dependences. llvm-svn: 275565	2016-07-15 14:43:04 +00:00
Tobias Grosser	6293ba6973	GPGPU: Add memory reference tag ids to tagged accesses It seems we forgot to actually add the memory access ids to the tagged accesses, but instead just tagged the accesses with empty isl_ids. This issue was found by inspection and without code generation it is difficult to test just by itself. We fix it for now without test case and expect our code generation tests to cover this later on. llvm-svn: 275557	2016-07-15 12:44:27 +00:00
Tobias Grosser	cfa0361d35	GPGPU: Do not check for hidden declarations We do not have them in Polly and the code to check for them is directly referring to pet data structures which we do not have available. This commit avoids undefined behavior. As such issues are difficult to reproduce, this commit comes without a test case. llvm-svn: 275553	2016-07-15 11:42:53 +00:00
Tobias Grosser	225dca7838	GPGPU: Test scalar/array types i1/i3/i8/i32/i60/i64/i80/i120/i128/i3000 Arrays with integer base type are similar to arrays with floating point types, with the exception that LLVM's integer types can take some odd values. We add a selection of different values to make sure we correctly round these types when necessary. References to scalar integer types are special, as we currently do not model these types as array accesses as they are considered 'synthesizable' by Polly. As a result, we do not generate explicit data-transfers for them, but instead will need to keep track of all references to 'synthesizable' values separately. At the current stage, this is only visible by missing host-to-device data-transfer calls. In the future, we will also require special code generation strategies. llvm-svn: 275551	2016-07-15 11:33:47 +00:00
Tobias Grosser	8d9dcfc592	GPGPU: Test scalar parameters of type half/float/double/fp128/x86_fp80/ppc_fp128 We currently only test that the code structure we generate for these scalar parameters is correct and we add these types to make sure later code generation additions have sufficient test coverage. In case some of these types cannot be mapped due to missing hardware support on the GPU some of these test cases may need to be updated later on. llvm-svn: 275548	2016-07-15 11:12:29 +00:00
Tobias Grosser	2d010daf85	GPGPU: Make sure scops with more than one array work We use this opportunity to add a test case containing a scalar parameter. llvm-svn: 275547	2016-07-15 10:51:14 +00:00
Tobias Grosser	b307ed4d08	GPGPU: Free options to avoid memory leak ppcg does not free the option structs for us. To avoid a memory leak we do this ourselves. llvm-svn: 275546	2016-07-15 10:32:22 +00:00
Tobias Grosser	a56f8f8e58	GPGPU: Shorten ppcg include paths to avoid conflict with cuda.h Instead of directly linking to ppcg's main source directory, we link to the parent directory. This allows us to access ppcg's include files with 'ppcg/cuda.h' and avoids a conflict with NVIDIA's cuda.h header. Also drop an include directory that is currently not used. llvm-svn: 275536	2016-07-15 07:50:36 +00:00
Tobias Grosser	60f63b49f2	GPGPU: Model array access information This allows us to derive host-device and device-host data-transfers. llvm-svn: 275535	2016-07-15 07:05:54 +00:00
Tobias Grosser	eeb8a95ac5	GPGPU: Use CHECK-NEXT to harden test cases A sequence of CHECK lines allows additional statements to appear in the output of the tested program without any test failures appearing. As we do not want this to happen, switch this test case to use CHECK-NEXT. llvm-svn: 275534	2016-07-15 07:05:49 +00:00
Tobias Grosser	69b4675180	GPGPU: Generate an AST for the GPU-mapped schedule For this we need to provide an explicit list of statements as they occur in the polly::Scop to ppcg. We also set up basic AST printing facilities to facilitate debugging. To allow code reuse some (minor) changes in ppcg have been necessary. llvm-svn: 275436	2016-07-14 15:51:37 +00:00
Tobias Grosser	60c6002570	GPGPU: Add dummy implementation for ast expression construction Instead of calling to a pet function that does not return anything, we pass our own dummy implementation to ppcg that always returns a nullptr. This ensures that the list of ast expressions always contains a nullptr and we do not accidentally free a random (uninitialized) pointer. This resolves the last valgrind warning we see. We provide an implementation for this function, when the generated AST expressions can be used and consequently can be tested. llvm-svn: 275435	2016-07-14 15:51:32 +00:00
Tobias Grosser	4eaedde530	GPGPU: Use a tile size of 32 by default The tile size was previously uninitialized. As a result, it was often zero (aka. no tiling), which is not what we want in general. More importantly, there was the risk for arbitrary tile sizes to be chosen, which we did not observe, but which still is highly problematic. llvm-svn: 275418	2016-07-14 14:14:02 +00:00
Benjamin Kramer	56a46bc680	Upgrade all the .arcconfigs to https. llvm-svn: 275409	2016-07-14 13:15:37 +00:00
Tobias Grosser	bd81a7eebc	Fix formatting llvm-svn: 275397	2016-07-14 10:53:00 +00:00
Tobias Grosser	aef5196f75	GPGPU: Map initial schedule to GPU schedule This change now applies ppcg's GPU mapping on our initial schedule. For this to work, we need to also initialize the set of all names (isl_ids) used in the scop as well as the program context. llvm-svn: 275396	2016-07-14 10:51:52 +00:00
Tobias Grosser	681bd5688f	GPGPU: Do not dump schedule by default llvm-svn: 275395	2016-07-14 10:51:47 +00:00
Roman Gareev	6cf195b6d5	[NFC] Add full title/author information to "Apply the BLIS matmul optimization pattern" llvm-svn: 275392	2016-07-14 10:40:15 +00:00
Tobias Grosser	f384594d5e	GPGPU: compute new schedule from polly scop To do so we copy the necessary information to compute an initial schedule from polly::Scop to ppcg's scop. Most of the necessary information is directly available and only needs to be passed on to ppcg, with the exception of 'tagged' access relations, access relations that additionally carry information about which memory access an access relation originates from. We could possibly perform the construction of tagged accesses as part of ScopInfo, but as this format is currently specific to ppcg we do not do this yet, but keep this functionality local to our GPU code generation. After the scop has been initialized, we compute data dependences and ask ppcg to compute an initial schedule. Some of this functionality is already available in polly::DependenceInfo and polly::ScheduleOptimizer, but to keep differences to ppcg small we use ppcg's functionality here. We may later investigate if a closer integration of these tools makes sense. llvm-svn: 275390	2016-07-14 10:22:25 +00:00
Tobias Grosser	e938517e37	GPGPU: create default initialized PPCG scop and gpu program At this stage, we do not yet modify the IR but just generate a default initialized ppcg_scop and gpu_prog and free both immediately. Both will later be filled with data from the polly::Scop and are needed to use PPCG for GPU schedule generation. This commit does not yet perform any GPU code generation, but ensures that the basic infrastructure has been put in place. We also add a simple test case to ensure the new code is run and use this opportunity to verify that GPU_CODEGEN tests are only run if GPU code generation has been enabled in cmake. llvm-svn: 275389	2016-07-14 10:22:19 +00:00
Tobias Grosser	562d3aa80a	PPCGCodegen: Support compilation without GPU support llvm-svn: 275310	2016-07-13 19:52:24 +00:00
Tobias Grosser	9dfe4e7c05	Add accelerator code generation pass skeleton Add a new pass to serve as basis for automatic accelerator mapping in Polly. The pass structure and the analyses preserved are copied from CodeGeneration.cpp, as we will rely on IslNodeBuilder and IslExprBuilder for LLVM-IR code generation. Polly's accelerator code generation is enabled with -polly-target=gpu. I would like to use this commit as opportunity to thank Yabin Hu for his work in the context of two Google summer of code projects during which he implemented initial prototypes of the Polly accelerator code generation -- in parts this code is already available in today's Polly (e.g., tools/GPURuntime). More will come as part of the upcoming Polly ACC changes. Reviewers: Meinersbur Subscribers: pollydev, llvm-commits Differential Revision: http://reviews.llvm.org/D22036 llvm-svn: 275275	2016-07-13 15:54:58 +00:00
Tobias Grosser	a041239bb7	Add ppcg-0.04 to lib/External ppcg will be used to provide mapping decisions for GPU code generation. As we do not use C as input language, we do not include pet. However, we include pet.h from pet 82cacb71 plus a set of dummy functions to ensure ppcg links without problems. The version of ppcg committed is unmodified ppcg-0.04 which has been well tested in the context of LLVM. It does not provide an official library interface yet, which means that in upcoming commits we will add minor modifications to make necessary functionality accessible. We will aim to upstream these modifications after we gained enough experience with GPU generation support in Polly to propose a stable interface. Reviewers: Meinersbur Subscribers: pollydev, llvm-commits Differential Revision: http://reviews.llvm.org/D22033 llvm-svn: 275274	2016-07-13 15:54:47 +00:00
Michael Kruse	3b0a9934fa	Add CHECK line to test case. NFC. Check not only that the compiler is not crashing, but also whether the problematic part (The sequence of instructions simplified to '4') is reflected in the output. Thanks to Tobias for the hint. llvm-svn: 275189	2016-07-12 16:37:50 +00:00
Michael Kruse	e448364320	[SCEVAffinator] Fix assertion checking for constant divisor. An assertion in visitSDivInstruction() checked whether the divisor is constant by checking whether the argument is a ConstantInt. However, SCEVValidator allows the divisor to be simplified to a constant by ScalarEvolution. We synchronize the implementation of SCEVValidator and SCEVAffinator to both accept simplified SCEV expressions. llvm-svn: 275174	2016-07-12 15:08:47 +00:00
Weiming Zhao	7614e178cb	Fix a build warning of unhandled enum in switch Summary: LLVM adds a new value FMRB_DoesNotReadMemory in the enumeration. Reviewers: andrew.w.kaylor, chrisj, zinob, grosser, jdoerfert Subscribers: Meinersbur, pollydev Differential Revision: http://reviews.llvm.org/D22109 llvm-svn: 275085	2016-07-11 18:27:52 +00:00
Tobias Grosser	faef9a7667	Fix gcc compile failure Commit r275056 introduced a gcc compile failure due to us using two types named 'Type', the first being the newly introduced member variable 'Type' the second being llvm::Type. We resolve this issue by renaming the newly introduced member variable to AccessType. llvm-svn: 275057	2016-07-11 12:27:04 +00:00
Tobias Grosser	4e2d9c45b9	InvariantEquivClassTy: Use struct instead of 4-tuple to increase readability Summary: With a struct we can use named accessors instead of generic std::get<3>() calls. This increases readability of the source code. Reviewers: jdoerfert Subscribers: pollydev, llvm-commits Differential Revision: http://reviews.llvm.org/D21955 llvm-svn: 275056	2016-07-11 12:15:10 +00:00
Tobias Grosser	42eef3acd7	Add test case forgotten in r275053 llvm-svn: 275055	2016-07-11 12:15:06 +00:00
Tobias Grosser	5329277f81	load hoisting: compute memory access invalid context only for domain We now compute the invalid context of memory accesses only for the domain under which the memory access is executed. Without limiting ourselves to this restricted domain, invalid accesses outside of the domain of actually executed statement instances may result in the execution domain of the statement to become empty despite the fact that the statement will actually be executed. As a result, such scops would use uninitialized values for their computations which results in incorrect computations. This fixes http://llvm.org/PR27944 and unbreaks the -polly-position=before-vectorizer buildbots. llvm-svn: 275053	2016-07-11 12:01:26 +00:00
Michael Kruse	586e579fe8	Fix assertion due to buildMemoryAccess. For llvm the memory accesses from nonaffine loops should be visible, however for polly those nonaffine loops should be invisible/boxed. This fixes llvm.org/PR28245 Contributed-by: Huihui Zhang <huihuiz@codeaurora.org> Differential Revision: http://reviews.llvm.org/D21591 llvm-svn: 274842	2016-07-08 12:38:28 +00:00
Justin Bogner	e2467baba8	Update for llvm r274769 llvm-svn: 274777	2016-07-07 18:03:30 +00:00
Tobias Grosser	932ec01328	isl: isl-0.17.1-164-gcbba1b6 This is a regular maintenance update to ensure the latest version of isl is tested. Interesting Changes: - AST nodes and expressions are now printed as YAML llvm-svn: 274614	2016-07-06 09:11:00 +00:00
Tobias Grosser	7945b16d65	test: Drop unnecessary -polly-code-generator=isl flag isl is already the default code generator since we switched from CLooG several years ago. llvm-svn: 274609	2016-07-06 07:02:22 +00:00
Tobias Grosser	91990ab3ac	GPURuntime: Only print status in debug mode This change moves all status messages that are printed in non-error mode behind the POLLY_DEBUG flag. llvm-svn: 274598	2016-07-06 03:04:53 +00:00
Tobias Grosser	856e31bb9c	GPURuntime: Drop polly_allocateMemoryForHostAndDevice This function is currently unused and will be replaced in the future by functions that allow to allocate memory only on the host or only on the device. llvm-svn: 274597	2016-07-06 03:04:50 +00:00
Tobias Grosser	a24d3ba26a	GPURuntime: Add basic debug tracing infrastructure When setting the POLLY_DEBUG environment variable, on calls to the run-time library the name of the function called is printed to stderr. llvm-svn: 274596	2016-07-06 03:04:47 +00:00
George Burgess IV	1a046de897	Try to fix polly buildbots. Broken by r274589. llvm-svn: 274595	2016-07-06 02:21:00 +00:00
Tobias Grosser	d1e90f5929	cmake: do not check-format anything in lib/External There is no need to specifically match for isl, but we can exclude anything in lib/External from formatting as we assume that externally contributed code should always match the upstream code. This simplifies the cmake script and allows additional external projects to be added without the need to explicitly exclude them from formatting. llvm-svn: 274557	2016-07-05 15:26:33 +00:00
Tobias Grosser	270cf12b3b	Correct two typos llvm-svn: 274430	2016-07-02 09:19:54 +00:00
Tobias Grosser	29a4dd92b7	CodegenCleanup: Drop CFLAA pass from codegen cleanup sequence Since r274197 -polly-position=before-vectorizer caused various LNT failures for example in SingleSource/Benchmarks/Linpack. These failures seem to only occur when the CFLAA pass is scheduled in our codegen-cleanup passes, which suggests that the way we call this AA pass is somehow problematic. As this pass is not of high importance, we drop the pass for now to prevent these failures from happening. At a later point, we might investigate more in-depth why this specific usage scenario caused correctness issues. llvm-svn: 274427	2016-07-02 07:58:13 +00:00
Tobias Grosser	2ea7c6e8d1	Ensure parameter names are isl-compatible Without this change it is not possible for isl to parse the resulting objects from their string representation. llvm-svn: 274350	2016-07-01 13:40:28 +00:00

2723 Commits