llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	718d04c653	Use isl::manage_copy to simplify calls to isl::manage(isl_.._copy()) As part of this cleanup a couple of unnecessary isl::manage(obj.copy()) patterns are eliminated as well. We checked for all potential cleanups by scanning for: "grep -R isl::manage\( lib/ \| grep copy" llvm-svn: 325558	2018-02-20 07:26:58 +00:00
Michael Kruse	163cacb469	[CodeGen] Detect empty domain because of parameters context. Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an empty domain (i.e. has no pieces). We already detected the case if the isl_pw_aff comes with an empty domain. isl_ast_build also considers the domain empty if it is disjoint with the parameter context (e.g. parameter values that we exclude by runtime versioning). Intersect the access relation domain with the parameter context to also detect such practically empty access domains. The effective pointer used in the generated code is unimportant because it will never be executed. This fixes llvm.org/PR35362 llvm-svn: 318806	2017-11-21 22:11:10 +00:00
Michael Kruse	58166b13e0	Run polly-update-format. NFC. polly-check-format has been failing since at least r318517, due to more than one cause. llvm-svn: 318795	2017-11-21 19:25:26 +00:00
Philip Pfaffe	00fd43b327	Port ScopInfo to the isl cpp bindings Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitly handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632	2017-11-19 22:13:34 +00:00
Michael Kruse	06618bf71a	[OpenMP] Fix reference collection of latest base ptrs. When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983	2017-10-31 10:28:22 +00:00
Philip Pfaffe	53c803871e	[Acc] Do not statically dispatch into IslNodeBuilder's createFor Summary: When GPUNodeBuilder creates loops inside the kernel, it dispatches to IslNodeBuilder. This however is surprisingly dangerous, since it accesses the AST Node's user through the wrong type. This patch fixes this problem by overriding createFor correctly. This fixes PR35010. Reviewers: grosser, bollu, Meinersbur Reviewed By: Meinersbur Subscribers: Meinersbur, nemanjai, pollydev, llvm-commits, kbarton Differential Revision: https://reviews.llvm.org/D39364 llvm-svn: 316872	2017-10-29 21:36:34 +00:00
Tobias Grosser	75d133f0ac	[IslExprBuilder] Do not generate RTC with more than 64 bit Such RTCs may introduce integer wrapping intrinsics with more than 64 bit, which are translated to library calls on AOSP that are not part of the runtime and will consequently cause linker errors. Thanks to Eli Friedman for reporting this issue and reducing the test case. llvm-svn: 314065	2017-09-23 15:32:07 +00:00
Siddharth Bhat	3928e3f50a	[ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses. In Polly, we specifically add a parameter to represent the outermost dimension size of Fortran arrays. We do this because this information is statically available from the Fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from memory accesses. This is wrong, we should materialize parameters from scop array info. It is wrong because if there is a case where we detect 2 Fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather than just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350	2017-09-01 18:55:43 +00:00
Eugene Zelenko	9248fde53a	[Polly] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 311704	2017-08-24 21:22:41 +00:00
Michael Kruse	06ed529205	Add more statistics. Add statistics about - Which optimizations are applied - Number of loops in Scops at various stages - Number of scalar/singleton writes at various stages representative for scalar false dependencies - Number of parallel loops These will be useful to find regressions due to moving Polly further down of LLVM's pass pipeline. Differential Revision: https://reviews.llvm.org/D37049 llvm-svn: 311553	2017-08-23 13:50:30 +00:00
Roman Gareev	6bfeba24d3	[NFC] Fix the broken comment. llvm-svn: 311477	2017-08-22 17:43:03 +00:00
Roman Gareev	0956a606ff	Disable the Loop Vectorizer in case of GEMM Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473	2017-08-22 17:38:46 +00:00
Tobias Grosser	9f2eb24c06	Clarify the intent of the run-time check llvm-svn: 311243	2017-08-19 16:26:39 +00:00
Tobias Grosser	43df2020e7	[GPGPU] Collect parameter dimension used in MemoryAccesses When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239	2017-08-19 12:58:28 +00:00
Tobias Grosser	e2a45f32dc	[GPGPU] Also record invariant loads as kernel subtree values Before this change kernels that used invariant loads would have resulted in invalid PTX code. llvm-svn: 311042	2017-08-16 21:37:53 +00:00
Michael Kruse	40d083956c	[CodeGen] Use isLatestArrayKind(). Codegen with -polly-parallel queried the unmapped MemoryAccess, but only the MemoryKind after mapping is relevant for codegen. This should fix various fails of the perf-x86_64-penryn-O3-polly-parallel-fast buildbot. llvm-svn: 310466	2017-08-09 12:27:51 +00:00
Tobias Grosser	61bd3a4840	[ScopInfo] Move Scop::getPwAffOnly to isl++ [NFC] llvm-svn: 310231	2017-08-06 21:42:38 +00:00
Tobias Grosser	b65ccc4302	[ScopInfo] Translate Scop::getParamSpace to isl++ [NFC] llvm-svn: 310224	2017-08-06 20:11:59 +00:00
Tobias Grosser	8ea1fc19b3	[ScopInfo] Translate Scop::getContext to isl++ [NFC] llvm-svn: 310221	2017-08-06 19:52:38 +00:00
Tobias Grosser	9a63570b13	[ScopInfo] Translate Scop::getIdForParam to isl++ [NFC] llvm-svn: 310220	2017-08-06 19:31:27 +00:00
Tobias Grosser	132860afe5	[ScopInfo] Move ScopStmt::setAstBuild/getAstBuild to isl++ llvm-svn: 310216	2017-08-06 17:53:04 +00:00
Tobias Grosser	dcf8d696ff	Move ScopInfo::getDomain(), getDomainSpace(), getDomainId() to isl++ llvm-svn: 310209	2017-08-06 16:39:52 +00:00
Siddharth Bhat	e53c924b0f	[Polly] [PPCGCodeGeneration] Deal with loops outside the Scop correctly in PPCGCodeGeneration. A Scop with a loop outside it is not handled currently by PPCGCodeGeneration. The test case is such that the Scop has only one inner loop that is detected. This currently breaks codegen. The fix is to reuse the existing mechanism in `IslNodeBuilder` within `GPUNodeBuilder`. Differential Revision: https://reviews.llvm.org/D36290 llvm-svn: 310193	2017-08-06 02:39:05 +00:00
Siddharth Bhat	0caed1fbe6	[IslNodeBuilder] [NFC] Refactor creation of loop induction variables of loops outside scops. This logic is duplicated, so we refactor it into a separate function. This will be used in a later patch to teach PPCGCodeGen code generation for loops that are outside the scop. Differential Revision: https://reviews.llvm.org/D36310 llvm-svn: 310192	2017-08-06 02:07:11 +00:00
Siddharth Bhat	f2cfd2a4db	[NFC] [IslNodeBuilder, GPUNodeBuilder] Unify mechanism for looking up replacement Values. We populate `IslNodeBuilder::ValueMap` which contains replacements for `llvm::Value`s. There was no simple method to pick up a replacement if it exists, otherwise fall back to the original. Create a method `IslNodeBuilder::getLatestValue` which provides this functionality. This will be used in a later patch to fix bugs in `PPCGCodeGeneration` where the latest value is not being used. Differential Revision: https://reviews.llvm.org/D36000 llvm-svn: 309674	2017-08-01 12:15:51 +00:00
Tobias Grosser	7639db8ed9	[IslNodeBuilder] Remove unused instruction Suggested-by: Maximilian Falkenstein <falkensm@student.ethz.ch> llvm-svn: 309533	2017-07-31 01:59:23 +00:00
Tobias Grosser	3b196131b5	Move applyScheduleToAccessRelation to isl++ llvm-svn: 308842	2017-07-23 04:08:52 +00:00
Tobias Grosser	6a87036e0f	Move MemoryAccess::getAddressFunction to isl++ llvm-svn: 308841	2017-07-23 04:08:45 +00:00
Tobias Grosser	1515f6b937	Move MemoryAccess::NewAccessRelation to isl++ We also move related accessor functions llvm-svn: 308840	2017-07-23 04:08:38 +00:00
Tobias Grosser	fe46c3ff3a	Move MemoryAccess::id to isl++ llvm-svn: 308836	2017-07-23 04:08:11 +00:00
Tobias Grosser	77eef90f50	Move ScopArrayInfo to isl++ This moves the full ScopArrayInfo class to isl++ llvm-svn: 308801	2017-07-21 23:07:56 +00:00
Tobias Grosser	1eeedf4829	[IslNodeBuilder] Relax complexity check in invariant loads and run it early When performing invariant load hoisting we check that invariant load expressions are not too complex. Up to this commit, we performed this check by counting the sum of dimensions in the access range as a very simple heuristic. This heuristic is a little too conservative, as it prevents hoisting for any scops with a very large number of parameters. Hence, we update the heuristic to only count existentially quantified dimensions and set dimensions. We expect this to still detect the problematic expressions in h264 because of which this check was originally introduced. For some unknown reason, this complexity check was originally committed in IslNodeBuilder. It really belongs in ScopInfo, as there is no point in optimizing a program which we could have known earlier cannot be code generated. The benefit of running the check early is that we can avoid to even hoist checks that are expensive to code generate as invariant loads. This can be seen in the changed tests, where we now indeed detect the scop, but just not invariant load hoist the complicated access. We also improve the formatting of the code, document it, and use isl++ to simplify expressions. llvm-svn: 308659	2017-07-20 19:55:19 +00:00
Siddharth Bhat	a1b2086a33	[Invariant Loads] Do not consider invariant loads to have dependences. We need to relax constraints on invariant loads so that they do not create fake RAW dependences. So, we do not consider invariant loads as scalar dependences in a region. During these changes, it turned out that we do not consider `llvm::Value` replacements correctly within `PPCGCodeGeneration` and `ISLNodeBuilder`. The replacements dictated by `ValueMap` were not being followed in all places. This was fixed in this commit. There is no clean way to decouple this change because this bug only seems to arise when the relaxed version of invariant load hoisting was enabled. Differential Revision: https://reviews.llvm.org/D35120 llvm-svn: 307907	2017-07-13 12:18:56 +00:00
Michael Kruse	b738ffa845	Heap allocation for new arrays. This patch aims to implement the option of allocating new arrays created by polly on heap instead of stack. To enable this option, a key named 'allocation' must be written in the imported json file with the value 'heap'. We need such a feature because in a next iteration, we will implement a mechanism of maximal static expansion which will need a way to allocate arrays on heap. Indeed, the expansion is very costly in terms of memory and doing the allocation on stack is not worth considering. The malloc and the free are added respectively at polly.start and polly.exiting such that there is no use-after-free (for instance in case of Scop in a loop) and such that all memory cells allocated with a malloc are free'd when we don't need them anymore. We also add : - In the class ScopArrayInfo, we add a boolean as member called IsOnHeap which represents the fact that the array is allocated on heap or not. - A new branch in the method allocateNewArrays in the ISLNodeBuilder for the case of heap allocation. allocateNewArrays now takes a BBPair containing polly.start and polly.exiting. allocateNewArrays takes these two blocks and adds the malloc and free calls respectively to polly.start and polly.exiting. - As IntPtrTy for the malloc call, we use the DataLayout one. To do that, we have modified : - createScopArrayInfo and getOrCreateScopArrayInfo such that they return a non-const SAI, in order to be able to call setIsOnHeap in the JSONImporter. - executeScopConditionally such that it returns both start block and end block of the scop, because we need these two blocks to be able to add the malloc and the free calls at the right position. Differential Revision: https://reviews.llvm.org/D33688 llvm-svn: 306540	2017-06-28 13:02:43 +00:00
Michael Kruse	a6d48f59a1	Fix a lot of typos. NFC. llvm-svn: 304974	2017-06-08 12:06:15 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then execute the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Siddharth Bhat	b7f68b8c9e	[Fortran Support] Materialize outermost dimension for Fortran array. - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429	2017-05-19 15:07:45 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Hongbin Zheng	0f8f177682	[Polly] Do not introduce address space cast Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519	2017-04-27 06:42:14 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Philip Pfaffe	2d950f36ee	[Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423	2017-04-04 10:01:53 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Roman Gareev	23df27682a	Map the new load to the base pointer of the invariant load hoisted load Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507	2017-03-22 13:57:53 +00:00
Tobias Grosser	b28f86e9e6	[CodeGen] Remove need for all parameters to be in scop context for load hoisting. When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions. llvm-svn: 298197	2017-03-18 23:12:49 +00:00
Michael Kruse	52ab4943b4	Remove all references to PostDominators. NFC. Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the Polly pass chain to be interrupted. It is not used by any Polly pass anymore, hence we can remove all references to it. llvm-svn: 295983	2017-02-23 15:16:22 +00:00
Tobias Grosser	ff40087a6a	Update to recent formatting changes llvm-svn: 293756	2017-02-01 10:12:09 +00:00
Tobias Grosser	587f1f57ad	[Polly] [BlockGenerator] Unify ScalarMap and PhiOpsMap Instead of keeping two separate maps from Value to Allocas, one for MemoryType::Value and the other for MemoryType::PHI, we introduce a single map from ScopArrayInfo to the corresponding Alloca. This change is intended, both as a general simplification and cleanup, but also to reduce our use of MemoryAccess::getBaseAddr(). Moving away from using getBaseAddr() makes sure we have only a single place where the array (and its base pointer) for which we generate code for is specified, which means we can more easily introduce new access functions that use a different ScopArrayInfo as base. We already today experiment with modifiable access functions, so this change does not address a specific bug, but it just reduces the scope one needs to reason about. Another motivation for this patch is https://reviews.llvm.org/D28518, where memory accesses with different base pointers could possibly be mapped to a single ScopArrayInfo object. Such a mapping is currently not possible, as we currently generate alloca instructions according to the base addresses of the memory accesses, not according to the ScopArrayInfo object they belong to. By making allocas ScopArrayInfo specific, a mapping to a single ScopArrayInfo object will automatically mean that the same stack slot is used for these arrays. For D28518 this is not a problem, as only MemoryType::Array objects are mapping, but resolving this inconsistency will hopefully avoid confusion. llvm-svn: 293374	2017-01-28 07:42:10 +00:00
Tobias Grosser	e1ff0cf2eb	Relax assert when setting access functions with invariant base pointers Summary: Instead of forbidding such access functions completely, we verify that their base pointer has been hoisted and only assert in case the base pointer was not hoisted. I was trying for a little while to get a test case that ensures the assert is correctly fired in case of invariant load hoisting being disabled, but I could not find a good way to do so, as llvm-lit immediately aborts if a command yields a non-zero return value. As we do not generally test our asserts, not having a test case here seems OK. This resolves http://llvm.org/PR31494 Suggested-by: Michael Kruse <llvm@meinersbur.de> Reviewers: efriedma, jdoerfert, Meinersbur, gareevroman, sebpop, zinob, huihuiz, pollydev Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D28798 llvm-svn: 292213	2017-01-17 12:00:42 +00:00
Tobias Grosser	21a059af09	Adjust formatting to commit r292110 [NFC] llvm-svn: 292123	2017-01-16 14:08:10 +00:00
Roman Gareev	bd5c6039c6	Align newly created arrays to the first level cache line boundary Aligning data to cache lines boundaries helps to avoid overheads related to an access to it ([1]). This patch aligns newly created arrays and adds an option to specify the first level cache line size. By default we use 64 bytes, which is a typical cache-line size ([2]). In case of Intel Core i7-3820 SandyBridge and the following options, clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME -march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true -DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8 -mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm -polly-target-latency-vector-fma=8 it helps to improve the performance from 11.303 GFlops/sec (39,247% of theoretical peak) to 12.63 GFlops/sec (43,8542% of theoretical peak). Refs.: [1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access [2] - http://igoro.com/archive/gallery-of-processor-cache-effects/ Differential Revision: https://reviews.llvm.org/D28020 Reviewed-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 290253	2016-12-21 12:37:36 +00:00


147 Commits