llvm-project

Commit Graph

Author	SHA1	Message	Date
Johannes Doerfert	81aa6e882f	[NFC] Adjust naming scheme of statistic variables Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 287347	2016-11-18 14:37:08 +00:00
Johannes Doerfert	6cd59e9076	Probably overwritten loads should not be considered hoistable Do not assume a load to be hoistable/invariant if the pointer is used by another instruction in the SCoP that might write to memory and that is always executed. llvm-svn: 287272	2016-11-17 22:25:17 +00:00
Johannes Doerfert	50dfbc572a	[NFC] Add flag to disable error block assumptions The declaration as an "error block" is currently aggressive and not very smart. This patch allows to disable error blocks completely. This might be useful to prevent SCoP expansion to a point where the assumed context becomes infeasible, thus the SCoP has to be discarded. llvm-svn: 287271	2016-11-17 22:16:35 +00:00
Johannes Doerfert	c97654681e	[FIX] Do not try to hoist memory intrinsic Since we do not necessarily treat memory intrinsics as non-affine anymore, we have to check for them explicitly before we try to hoist an access. llvm-svn: 287270	2016-11-17 22:11:56 +00:00
Johannes Doerfert	b3265a3612	[NFC] Skip over trivial assumptions Filter trivial assumptions, thus assume { : } or restrict { : 0 = 1 }, as they clutter the user output as well as the statistics. llvm-svn: 287269	2016-11-17 22:08:40 +00:00
Johannes Doerfert	dae2e9287d	[DBG] Collect statistics about actually versioned SCoPs llvm-svn: 287267	2016-11-17 21:55:43 +00:00
Johannes Doerfert	8c5464a715	[DBG] Allow to emit the RTC value at runtime The new command line flag "polly-codegen-emit-rtc-print" can be used to place a "printf" in the generated code that will print the RTC value and the overflow state. llvm-svn: 287265	2016-11-17 21:49:19 +00:00
Johannes Doerfert	cfadb2293f	[DBG] Collect statistics about statically infeasible SCoPs llvm-svn: 287263	2016-11-17 21:44:47 +00:00
Johannes Doerfert	cd195326bf	[DBG] Collect statistics about taken assumptions llvm-svn: 287261	2016-11-17 21:41:08 +00:00
Tobias Grosser	06e1592663	Update to isl-0.17.1-267-gbf9723d This update corrects an incorrect generation of min/max expressions in the isl AST generator and a problematic nullptr dereference. llvm-svn: 287098	2016-11-16 11:06:47 +00:00
Tobias Grosser	26be8e99b6	[ScopBuilder] Drop unnecessary namespace identifiers [NFC] llvm-svn: 286781	2016-11-13 21:28:13 +00:00
Tobias Grosser	5743e8de86	[SCEVAffinator] Do not scan redundantly for parameters In r286430 "SCEVValidator: add new parameters resulting from constant extraction" we added functionality to scan for parameters after constant extraction has taken place to ensure newly created parameters are correctly registered. This addition made the already existing registration of parameters redundant. Hence, we remove the corresponding call in this commit. An alternative solution would have been to also perform constant extraction when validating SCEV expressions and to then scan for parameters when validating a SCEV expression. However, as SCEV validation is used during SCoP detection where we want to be especially fast, adding additional functionality on this hot path should be avoided if good alternatives exist. In this case, we can choose to continue to only transform SCEV expression when actually modeling them. As all transformations we perform are expected to not change the validity of the SCEV expressions, this solution seems preferable. Suggested-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286780	2016-11-13 21:28:07 +00:00
Tobias Grosser	70d2709b1a	[ScopDetect] Conservatively handle inaccessible memory alias attributes Commit r286294 introduced support for inaccessiblememonly and inaccessiblemem_or_argmemonly attributes to BasicAA, which we need to support to avoid undefined behavior. This change just refuses all calls which are annotated with these attributes, which is conservatively correct. In the future we may consider to model and support such function calls in Polly. llvm-svn: 286771	2016-11-13 19:27:24 +00:00
Tobias Grosser	a9cac6a732	[tests] Adjust test output to recently changed SCEV canonicalization [NFC] LLVM recently changed the SCEV canonicalization which changed the output of one of our GPGPU test cases. llvm-svn: 286770	2016-11-13 19:27:17 +00:00
Tobias Grosser	a2f8fa33aa	[ScopDetect] Evaluate and verify branches at branch condition, not icmp The validity of a branch condition must be verified at the location of the branch (the branch instruction), not the location of the icmp that is used in the branch instruction. When verifying at the wrong location, we may accept an icmp that is defined within a loop which itself dominates, but does not contain the branch instruction. Such loops cannot be modeled as we only introduce domain dimensions for surrounding loops. To address this problem we change the scop detection to evaluate and verify SCEV expressions at the right location. This issue has been around since at least r179148 "scop detection: properly instantiate SCEVs to the place where they are used", where we explicitly set the scope to the wrong location. Before this commit the scope was not explicitly set, which probably also resulted in the scope around the ICmp being chosen. This resolves http://llvm.org/PR30989 Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286769	2016-11-13 19:27:04 +00:00
Tobias Grosser	f67433abd9	SCEVAffinator: pass parameter-only set to addRestriction if BB=nullptr Assumptions can either be added for a given basic block, in which case the set describing the assumptions is expected to match the dimensions of its domain. In case no basic block is provided a parameter-only set is expected to describe the assumption. The piecewise expressions that are generated by the SCEVAffinator sometimes have a zero-dimensional domain (e.g., [p] -> { [] : p <= -129 or p >= 128 }), which looks similar to a parameter-only domain, but is still a set domain. This change adds an assert that checks that we always pass parameter domains to addAssumptions if BB is empty to make mismatches here fail early. We also change visitTruncExpr to always convert to parameter sets, if BB is null. This change resolves http://llvm.org/PR30941 Another alternative to this change would have been to inspect all code to make sure we directly generate in the SCEV affinator parameter sets in case of empty domains. However, this would likely complicate the code which combines parameter and non-parameter domains when constructing a statement domain. We might still consider doing this at some point, but as this likely requires several non-local changes this should probably be done as a separate refactoring. Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286444	2016-11-10 11:44:10 +00:00
Tobias Grosser	d0b9173caa	IslAst: always use the context during ast generation Providing the context to the ast generator allows for additional simplifications and -- more importantly -- allows to generate loops with only partially bounded domains, assuming the domains are bounded for all parameter configurations that are valid as defined by the context. This change fixes the crash reported in http://llvm.org/PR30956 The original reason why we did not include the context when generating an AST was that CLooG and later isl used to sometimes transfer some of the constraints that bound the size of parameters from the context into the generated AST. This resulted in operations with very large constants, which sometimes introduced problematic integer overflows. The latest versions of the isl AST generator are careful to not introduce such constants. Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286442	2016-11-10 09:39:58 +00:00
Tobias Grosser	4d543d654a	SCEVValidator: add new parameters resulting from constant extraction When extracting constant expressions out of SCEVs, new parameters may be introduced, which have not been registered before. This change scans SCEV expressions after constant extraction again to make sure newly introduced parameters are registered. We may for example extract the constant '8' from the expression '((8 * ((%a * %b) + %c)) + (-8 * %a))' and obtain the expression '(((-1 + %b) * %a) + %c)'. The new expression has a new parameter '((-1 + %b) * %a)', which was not registered before, but must be registered to not crash. This closes http://llvm.org/PR30953 Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286430	2016-11-10 06:45:28 +00:00
Tobias Grosser	bbaeda3fe5	Do not allow switch statements in loop latches In r248701 "Allow switch instructions in SCoPs" support for switch statements has been introduced, but support for switch statements in loop latches was incomplete. This change completely disables switch statements in loop latches. The original commit changed addLoopBoundsToHeaderDomain to support non-branch terminator instructions, but this change was incorrect: it added a check for BI != null to the if-branch of a condition, but BI was used in the else branch as well. As a result, when a non-branch terminator instruction is encountered, a nullptr dereference is triggered. Due to missing test coverage, this bug was overlooked. r249273 "[FIX] Approximate non-affine loops correctly" added code to disallow switch statements for non-affine loops, if they appear in either a loop latch or a loop exit. We adapt this code to now prohibit switch statements in loop latches even if the control condition is affine. We could possibly add support for switch statements in loop latches, but such support should be evaluated and tested separately. This fixes llvm.org/PR30952 Reported-by: Eli Friedman <efriedma@codeaurora.org> llvm-svn: 286426	2016-11-10 05:20:29 +00:00
Tobias Grosser	eba86a1208	ScopInfo: only run code needed for ASSERT in DEBUG mode Suggested-by: Johannes Doerfert llvm-svn: 286338	2016-11-09 04:24:49 +00:00
Tobias Grosser	a8ca3ed06a	SCEVValidator: reduce indentation to increase readability [NFC] llvm-svn: 286217	2016-11-08 07:17:48 +00:00
Tobias Grosser	16480186f8	IslNodeBuilder: Ensure newly generated memory accesses are well-defined Add some additional asserts that ensure newly code-generated memory accesses are defined on all domain and schedule domain instances. llvm-svn: 286050	2016-11-05 21:46:01 +00:00
Tobias Grosser	744740ad91	ScopInfo: Ensure copy statement memory accesses are correct Add asserts that verify that the memory accesses of a new copy statement are defined for all domain instances the copy statement is defined for. llvm-svn: 286047	2016-11-05 21:02:43 +00:00
Tobias Grosser	9321e208e8	Update isl to isl-0.17.1-243-g24c0339 This introduces big-endian support in imath and resolves http://llvm.org/PR24632. llvm-svn: 285993	2016-11-04 11:56:48 +00:00
Hongbin Zheng	ada8544dfb	Remove POLLY_LINK_LIBS, it is not used llvm-svn: 285976	2016-11-04 00:32:32 +00:00
Michael Kruse	e1dc387731	[ScopInfo] Fix isl object leak. Fix return from function without releasing isl objects, which was introduced in r269055. llvm-svn: 285924	2016-11-03 15:19:41 +00:00
Eli Friedman	acf8006471	[Polly CodeGen] Break critical edge from RTC to original loop. This makes polly generate a CFG which is closer to what we want in LLVM IR, with a loop preheader for the original loop. This is just a cleanup, but it exposes some fragile assumptions. I'm not completely happy with the changes related to expandCodeFor; RTCBB->getTerminator() is basically a random insertion point which happens to work due to the way we generate runtime checks. I'm not sure what the right answer looks like, though. Differential Revision: https://reviews.llvm.org/D26053 llvm-svn: 285864	2016-11-02 22:32:23 +00:00
Eli Friedman	b9c6f01a81	[ScopInfo] Make memset etc. affine where possible. We don't actually check whether a MemoryAccess is affine in very many places, but one important one is in checks for aliasing. Differential Revision: https://reviews.llvm.org/D25706 llvm-svn: 285746	2016-11-01 20:53:11 +00:00
Eli Friedman	6768285dcc	Add missing test from r284848. Original commit title: [SCEVAffinator] Make precise modular math more correct. llvm-svn: 285745	2016-11-01 20:45:28 +00:00
Tobias Grosser	ebb626e4b7	[ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter ScalarEvolution got at some point a SCEVRewriteVisitor. Use it to simplify our SCEVRemoveSMax visitor. llvm-svn: 285491	2016-10-29 06:19:34 +00:00
Michael Kruse	426e6f71f8	[ScopInfo] Fix: use raw source pointer. When adding an llvm.memcpy instruction to AliasSetTracker, it uses the raw source and target pointers which preserve bitcasts. MemAccInst::getPointerOperand() also returns the raw target pointers, but Scop::buildAliasGroups() did not for the source pointer. This led to mismatches between AliasSetTracker and ScopInfo on which pointer to use. Fixed by also using raw pointers in Scop::buildAliasGroups(). llvm-svn: 285071	2016-10-25 13:37:43 +00:00
Mandeep Singh Grang	3642dd3ca4	[polly] Change SmallPtrSet which is being iterated to SmallSetVector in ScopInfo.h Summary: This will avoid non-deterministic iteration order. Reviewers: grosser, jdoerfert, zinob, mgrang Subscribers: #polly Tags: #polly Differential Revision: https://reviews.llvm.org/D25880 llvm-svn: 284883	2016-10-21 21:00:11 +00:00
Eli Friedman	286c5a76ba	[SCEVAffinator] Make precise modular math more correct. Integer math in LLVM IR is modular. Integer math in isl is arbitrary-precision. Modeling LLVM IR math correctly in isl requires either adding assumptions that math doesn't actually overflow, or explicitly wrapping the math. However, expressions with the "nsw" flag are special; we can pretend they're arbitrary-precision because it's undefined behavior if the result wraps. SCEV expressions based on IR instructions with an nsw flag also carry an nsw flag (roughly; actually, the real rule is a bit more complicated, but the details don't matter here). Before this patch, SCEV flags were also overloaded with an additional function: the ZExt code was mutating SCEV expressions as a hack to indicate to checkForWrapping that we don't need to add assumptions to the operand of a ZExt; it'll add explicit wrapping itself. This kind of works... the problem is that if anything else ever touches that SCEV expression, it'll get confused by the incorrect flags. Instead, with this patch, we make the decision about whether to explicitly wrap the math a bit earlier, basing the decision purely on the SCEV expression itself, and not its users. Differential Revision: https://reviews.llvm.org/D25287 llvm-svn: 284848	2016-10-21 18:08:02 +00:00
Mandeep Singh Grang	48e7add80f	[polly] Change SmallPtrSet which are being iterated into SmallSetVector Summary: Otherwise the lack of an iteration order results in non-determinism in codegen. Reviewers: _jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25863 llvm-svn: 284845	2016-10-21 17:29:10 +00:00
Michael Kruse	3d0266b664	[cmake] Avoid warnings in feature tests. NFC. Apply the __attribute__((unused)) before the function to unambiguously apply to the function declaration. Add more casts-to-void to mark return values unused as intended. Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk> llvm-svn: 284718	2016-10-20 11:16:19 +00:00
Tobias Grosser	69ed8542f5	Update isl to isl-0.17.1-236-ga9c6cc7 This includes isl_id_to_str, which is used in Michael's upcoming DeLICM patch. llvm-svn: 284689	2016-10-20 01:59:24 +00:00
Mandeep Singh Grang	5b1abfc88e	[polly] Fix non-determinism in polly BlockGenerators Summary: Iterating over SeenBlocks which is a SmallPtrSet results in non-determinism in codegen Reviewers: jdoerfert, zinob, grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D25778 llvm-svn: 284622	2016-10-19 17:56:49 +00:00
Michael Kruse	6b87504973	[test] Fix buildbot after SCEV change. Update test after commit r284501: [SCEV] Make CompareValueComplexity a little bit smarter Contributed-by: Sanjoy Das <sanjoy@playingwithpointers.com> llvm-svn: 284543	2016-10-18 22:58:09 +00:00
Eli Friedman	3c1a75bf9c	Handle multi-dimensional invariant load. If the address of a load depends on another load, make sure to emit the loads in the right order. llvm-svn: 284426	2016-10-17 21:04:26 +00:00
Michael Kruse	6a19d592da	[ScopDetect] Depend transitively on ScalarEvolution. ScopDetection might be queried by -dot-scops or -view-scops passes for which it accesses ScalarEvolution. llvm-svn: 284385	2016-10-17 13:29:20 +00:00
Michael Kruse	2ddb279a39	[test] Add missing colon. llvm-svn: 284349	2016-10-16 22:05:51 +00:00
Michael Kruse	17d5090532	[cmake] Add polly-isl-test dependency to lit tests. Also handle the in-llvm-tree case forgotten in r284339. llvm-svn: 284347	2016-10-16 21:35:57 +00:00
Michael Kruse	8dee3427f7	[cmake] Add polly-isl-test dependency to lit tests. lit recursively iterates through the test subdirectories and finds the ISL unittest. For this test to work, the polly-isl-test executable needs to be compiled. Add the polly-isl-test dependency to POLLY_TEST_DEPS. This makes check-polly and check-polly-tests work from a fresh build directory. llvm-svn: 284339	2016-10-16 18:22:02 +00:00
Michael Kruse	f0c06900ed	[test] Add -polly-unprofitable-scalar-accs to test that needs it. The test non_affine_loop_used_later.ll also tests the profitability heuristic. Add the option -polly-unprofitable-scalar-accs explicitly to ensure that the test succeeds if the default value is changed. llvm-svn: 284338	2016-10-16 18:13:01 +00:00
Tobias Grosser	4b5e24df9a	cmake: avoid "zero-length gnu_printf format string" warning in gcc 6.1.1 Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk> llvm-svn: 284302	2016-10-15 05:08:12 +00:00
Michael Kruse	fa53c86dc1	[ScopInfo/CodeGen] ExitPHI reads are implicit. Under some conditions MK_Value read accesses were converted to MK_ExitPHI read accesses. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen. Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block. Except in the situation observed in r265261. Do not convert value accesses to ExitPHI accesses and do not handle value accesses like ExitPHI accesses in CodeGen anymore. llvm-svn: 284023	2016-10-12 16:31:09 +00:00
Michael Kruse	c9edc2ee8d	[DepInfo] Print -debug output outside of max-operations scope. ISL tries to simplify the polyhedral operations before printing its objects. This increases the operations counter and therefore can contribute to hitting the operations limit. Therefore the result could be different when -debug output is enabled, making debugging harder. llvm-svn: 283745	2016-10-10 11:45:59 +00:00
Michael Kruse	8bfba1ff46	[Support/DepInfo] Introduce IslMaxOperationsGuard and make DepInfo use it. NFC. IslMaxOperationsGuard defines a scope where ISL may abort operations if they take too many operations. Replace the call to the raw ISL interface by a use of the guard. IslMaxOperationsGuard provides a uniform way to define a maximal computation time for a code region in C++ using RAII. llvm-svn: 283744	2016-10-10 11:45:54 +00:00
Tobias Grosser	b270288752	Fix formatting after recent cl:: changes This fixes 'make check-polly' llvm-svn: 283693	2016-10-09 08:31:35 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Hongbin Zheng	5860aef675	Define PATH_MAX on windows Differential Revision: https://reviews.llvm.org/D25372 llvm-svn: 283600	2016-10-07 20:58:20 +00:00
Michael Kruse	de42b43de0	[cmake] Unify disabling upstream project warnings. Handle MSVC, ISL and PPCG in one place. The only functional change is that warnings are also disabled for MSVC compiling PPCG (Which currently fails anyway). llvm-svn: 283547	2016-10-07 12:38:32 +00:00
Michael Kruse	4b5f6af2dc	[cmake] Move isl_test artifacts to Polly folder. Folders in Visual Studio solutions help organize the build artifacts from all LLVM projects. There is a folder to keep Polly-built files in. llvm-svn: 283546	2016-10-07 12:38:24 +00:00
Tobias Grosser	e84ee850d1	Build and run isl_test as part of check-polly Running isl tests is important to gain confidence that the isl build we created works as expected. Besides the actual isl tests, there are also isl AST generation tests shipped with isl. This change only adds support for the isl unit tests. AST generation test support is left for a later commit. There is a choice to run tests directly through the build system or in the context of lit. We choose to run tests as part of lit, as this allows us to easily set environment variables, print output only on error and generally run the tests directly from the lit command. Reviewers: brad.king, Meinersbur Subscribers: modocache, brad.king, pollydev, beanz, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25155 llvm-svn: 283245	2016-10-04 19:48:40 +00:00
Michael Kruse	6ab4476835	[ScopInfo] Add -polly-unprofitable-scalar-accs option. With this option one can disable the heuristic that assumes that statements with a scalar write access cannot be profitably optimized. Such statement instances necessarily have WAW-dependences to themselves. With DeLICM scalar accesses can be changed to array accesses, which can avoid these WAW-dependences. llvm-svn: 283233	2016-10-04 17:33:39 +00:00
Michael Kruse	ca7cbcca37	[ScopInfo] Scalar access do not have indirect base pointers. ScopArrayInfo used to determine base pointer origins by looking up whether the base pointer is a load. The "base pointer" for scalar accesses is the llvm::Value being accessed. This is only a symbolic base pointer, it represents the alloca variable (.s2a or .phiops) generated for it at code generation. This patch disables determining base pointer origin for scalars. A test case where this caused a crash will be added in the next commit. In that test SAI tried to get the origin base pointer that was only declared later, therefore not existing. This is probably only possible for scalars used in PHINode incoming blocks. llvm-svn: 283232	2016-10-04 17:33:34 +00:00
Michael Kruse	9116899615	[ScopInfo] Make simplifySCoP() public. NFC. This function may need to be called after the scop construction. The upcoming DeLICM will use this to cleanup statement that all write accesses have been removed from. llvm-svn: 283221	2016-10-04 14:14:33 +00:00
Tobias Grosser	5652c9c830	isl: update to isl-0.17.1-233-gc911e6a llvm-svn: 283049	2016-10-01 19:46:51 +00:00
Michael Kruse	4b0c5aea78	[CodeGen] Add assertion for indirect array index expression generation. NFC. Currently Polly cannot generate code for index expressions if the base pointer is computed within the scop. The base pointer must be generated as well, but there is no code that triggers that. Add an assertion to detect when this would occur and miscompile. The IR verifier should catch it as well. llvm-svn: 282893	2016-09-30 18:29:37 +00:00
Michael Kruse	f6f795e9ae	[Support] Complete ISL annotations to IslPtr<>. NFC. Add missing __isl_(give/take/keep) annotations to IslPtr<> and NonowningIslPtr<> methods. Because IslPtr's constructor's annotation would depend on the TakeOwnership parameter, the parameter has been removed. Callers must copy the object themselves if they do not want to take ownership. llvm-svn: 282883	2016-09-30 17:47:39 +00:00
Michael Kruse	51f514d853	[Support] Compile fix for gcc. NFC. gcc 5.4 insists on template specialization to be in a namespace polly { ... } block, instead of being prefixed with 'polly::'. Error message: root/src/llvm/tools/polly/lib/Support/GICHelper.cpp:203:54: error: specialization of ‘template<class T> void polly::IslPtr<T>::dump() const’ in different namespace [-fpermissive] template <> void polly::IslPtr<isl_##TYPE>::dump() const { \ ^ msvc14 and clang 3.8 did not complain. llvm-svn: 282874	2016-09-30 16:47:43 +00:00
Michael Kruse	55519dad62	[Support] Add (Nonowning-)IslPtr::dump(). NFC. The dump() methods can be called from a debugger instead of e.g. isl_*_dump(Var.Obj) where Var is a variable of type IslPtr/NonowningIslPtr. To ensure that the existence of the function pointers does not depend on whether the methods are used somewhere, they are declared with external linkage. llvm-svn: 282870	2016-09-30 16:10:19 +00:00
Michael Kruse	32312d0294	[Support] Call isl_*_free() only on non-null pointers. NFC. Add a non-NULL check before calling the free function into functions that are supposed to be inlined. First, this is a form of partial inlining of the free function, namely the nullptr test that free has to do. Secondly, and more importantly, it allows the compiler to remove the call to isl_*_free() when it knows that the object is nullptr, for instance because the last call is a take(). "Consuming" the last use of an ISL object using take() (instead of copy()) is a common pattern. llvm-svn: 282864	2016-09-30 15:29:43 +00:00
Michael Kruse	888ab55140	[CodeGen] Change 'Scalar' to 'Array' in method names. NFC. generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array) memory accesses, therefore the method names were misleading. The names also were similar to generateScalarLoads() and generateScalarStores() (plural forms) which indeed handle scalar accesses. Presumably, they were originally named to contrast VectorBlockGenerator::generateLoad(). Rename the two methods to generateArrayLoad(), respectively generateArrayStore(). llvm-svn: 282861	2016-09-30 14:34:05 +00:00
Michael Kruse	77394f1394	[CodeGen] Add assertion for partial scalar accesses. NFC. The code generator always adds unconditional LoadInst and StoreInst, hence the MemoryAccess must be defined over all statement instances. llvm-svn: 282853	2016-09-30 14:01:46 +00:00
Tobias Grosser	b110296011	www: Add Loopy publication llvm-svn: 282747	2016-09-29 18:17:30 +00:00
Tobias Grosser	ae91bc7353	www: add new code coverage link to Polly website llvm-svn: 282351	2016-09-25 08:03:38 +00:00
Tobias Grosser	349d1c3368	[ScopDetection] Remove redundant checks for endless loops Summary: Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks to reject endless loops which are now removed and replaced by a single check in `isValidLoop()`. For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is renamed to `ReportLoopHasNoExit`. The test case `ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well. The schedule generation in `buildSchedule()` is based on the following assumption: Given some block B that is contained in a loop L and a SESE region R, we assume that L is contained in R or the other way around. However, this assumption is broken in the presence of endless loops that are nested inside other loops. Therefore, in order to prevent erroneous behavior in `buildSchedule()`, r265280 introduced a corresponding check in `canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding another check to reject endless loops in `allowOverApproximatedRegion()` in r273905. Hence there existed two separate locations that handled this case. Thank you Johannes Doerfert for helping to provide the above background information. Reviewers: Meinersbur, grosser Subscribers: _jdoerfert, pollydev Differential Revision: https://reviews.llvm.org/D24560 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 281987	2016-09-20 17:05:22 +00:00
Tobias Grosser	122d6d74f6	Fix spelling in CMakeLists llvm-svn: 281897	2016-09-19 10:55:31 +00:00
Tobias Grosser	05ee64e67a	GPGPU: add missing REQUIRES line to test case llvm-svn: 281850	2016-09-18 08:57:38 +00:00
Tobias Grosser	bc653f2031	GPGPU: Do not run mostly sequential kernels in GPU In case sequential kernels are found deeper in the loop tree than any parallel kernel, the overall scop is probably mostly sequential. Hence, run it on the CPU. llvm-svn: 281849	2016-09-18 08:31:09 +00:00
Tobias Grosser	82f2af3508	GPGPU: Dynamically ensure 'sufficient compute' Offloading to a GPU is only beneficial if there is a sufficient amount of compute that can be accelerated. Many kernels just have a very small number of dynamic compute, which means GPU acceleration is not beneficial. We compute at run-time an approximation of how many dynamic instructions will be executed and fall back to CPU code in case this number is not sufficiently large. To keep the run-time checking code simple, we over-approximate the number of instructions executed in each statement by computing the volume of the rectangular hull of its iteration space. llvm-svn: 281848	2016-09-18 06:50:35 +00:00
Tobias Grosser	cfdee6582b	GPGPU: Make test cases independent of register numbering [NFC] llvm-svn: 281847	2016-09-18 06:50:28 +00:00
Tobias Grosser	51dfc27589	GPGPU: Store back non-read-only scalars We may generate GPU kernels that store into scalars in case we run some sequential code on the GPU because the remaining data is expected to already be on the GPU. For these kernels it is important to not keep the scalar values in thread-local registers, but to store them back to the corresponding device memory objects that backs them up. We currently only store scalars back at the end of a kernel. This is only correct if precisely one thread is executed. In case more than one thread may be run, we currently invalidate the scop. To support such cases correctly, we would need to always load and store back from a corresponding global memory slot instead of a thread-local alloca slot. llvm-svn: 281838	2016-09-17 19:22:31 +00:00
Tobias Grosser	fe74a7a1f5	GPGPU: Detect read-only scalar arrays ... and pass these by value rather than by reference. llvm-svn: 281837	2016-09-17 19:22:18 +00:00
Tobias Grosser	8f86a47461	Update CFGPrinter -> CFGPrinterLegacyPass .. to match recent changes in LLVM that broke the Polly compilation. llvm-svn: 281705	2016-09-16 05:48:09 +00:00
Tobias Grosser	aaabbbf886	GPGPU: Do not assume arrays start at 0 Our alias checks precisely check that the minimal and maximal accessed elements do not overlap in a kernel. Hence, we must ensure that our host <-> device transfers do not touch additional memory locations that are not covered in the alias check. To ensure this, we make sure that the data we copy for a given array is only the data from the smallest element accessed to the largest element accessed. We also adjust the size of the array according to the offset at which the array is actually accessed. An interesting result of this is: In case arrays are accessed with negative subscripts, e.g., A[-100], we automatically allocate and transfer _more_ data to cover the full array. This is important as such code indeed exists in the wild. llvm-svn: 281611	2016-09-15 14:05:58 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Tobias Grosser	e8c69bbabd	cmake: PollyPPCG depends on PollyISL This line makes BUILD_SHARED_LIBS=ON work for Polly-ACC. Without it, ld complains about missing isl symbols when constructing the shared library. llvm-svn: 281396	2016-09-13 21:09:35 +00:00
Tobias Grosser	0a893f7df4	GPGPU: Use const_cast to avoid compiler warning [NFC] llvm-svn: 281333	2016-09-13 13:22:27 +00:00
Michael Kruse	19c9d99f45	Use value directly instead of reference. NFC. The alias to the array element is read-only and a primitive type (pointer), therefore use the value directly instead of a reference to it. llvm-svn: 281311	2016-09-13 09:56:05 +00:00
Tobias Grosser	a82c4b5df8	GPGPU: Allow region statements llvm-svn: 281305	2016-09-13 08:42:10 +00:00
Tobias Grosser	b79f4d3970	GPGPU: Extend types when array sizes have smaller types This prevents a compiler crash. llvm-svn: 281303	2016-09-13 08:02:14 +00:00
Tobias Grosser	b51d507c74	Adapt test case to recent change in Global Variable Definition llvm-svn: 281295	2016-09-13 05:19:26 +00:00
Michael Kruse	e5e752a28b	Remove -fvisibility=hidden and FORCE_STATIC. The -fvisibility=hidden flag was used for the integrated Integer Set Library (and PPCG) to keep their definitions local to Polly. The motivation was to be loaded into a DragonEgg-powered GCC, where GCC might itself use ISL for its Graphite extension. The symbols of Polly's ISL and GCC's ISL would clash. The DragonEgg project is not actively developed anymore, but Polly's unittests need to call ISL functions to set up a testing environment. Unfortunately, the -fvisibility=hidden flag means that the ISL symbols are not available to the gtest executable as it resides outside of libPolly when linked dynamically. Currently, CMake links a second copy of ISL into the unittests which leads to subtle bugs. What got observed is that two isl_ids for isl_id_none exist, one for each library instance. Because isl_id's are compared by address, isl_id_none could happen to be different from isl_id_none, depending on which library instance set the address and does the comparison. Also remove the FORCE_STATIC flag which was introduced to keep the ISL symbols visible inside the same libPolly shared object, even when built with BUILD_SHARED_LIBS. Differential Revision: https://reviews.llvm.org/D24460 llvm-svn: 281242	2016-09-12 18:25:00 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	5857b701a3	GPGPU: Bail out gracefully in case of invalid IR Instead of aborting, we now bail out gracefully in case the kernel IR we generate is invalid. This can currently happen in case the SCoP stores pointer values, which we model as arrays, as data values into other arrays. In this case, the original pointer value is not available on the device and can consequently not be stored. As detecting this ahead of time is not so easy, we detect these situations after the invalid IR has been generated and bail out. llvm-svn: 281193	2016-09-12 06:06:31 +00:00
Tobias Grosser	0bf4cc6499	Add missing 'REQUIRES' line llvm-svn: 281166	2016-09-11 13:42:42 +00:00
Tobias Grosser	02293ed755	GPGPU: Do not fail in case of arrays never accessed If these arrays have never been accessed we failed to derive an upper bound of the accesses and consequently a size for the outermost dimension. We now explicitly check for empty access sets and then just use zero as size for the outermost dimension. llvm-svn: 281165	2016-09-11 13:30:12 +00:00
Tobias Grosser	89fda2bde2	GPURuntime: ensure compilation with C99 Otherwise, older compilers will error out on some of the C99 features we use. llvm-svn: 281159	2016-09-11 07:32:50 +00:00
Tobias Grosser	5aea5653b3	FlattenAlgo: Ensure we _really_ obtain a param space This resolves "isl_space.c:1775: not a parameter space" errors I have seen on two systems. llvm-svn: 281052	2016-09-09 16:11:26 +00:00
Tobias Grosser	a6987a4ddd	Add namespace specifier before nullptr_t This fixes the following compile time errors: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t' llvm-svn: 281039	2016-09-09 12:31:38 +00:00
Tobias Grosser	a3afe44d6c	IslNodeBuilder: Add missing __isl_take annotation llvm-svn: 281034	2016-09-09 11:16:50 +00:00
Michael Kruse	7886bd7ca5	Add -polly-flatten-schedule pass. The -polly-flatten-schedule pass reduces the number of scattering dimensions in its isl_union_map form to make them easier to understand. It is not meant to be used in production, only for debugging and regression tests. To illustrate how it can make sets simpler, here is a lifetime set computed by the proposed DeLICM pass without flattening: { Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0; Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5; Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0; Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3; Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 } And here the same lifetime for a semantically identical one-dimensional schedule: { Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 } Differential Revision: https://reviews.llvm.org/D24310 llvm-svn: 280948	2016-09-08 15:02:36 +00:00
Tobias Grosser	a2d80ba58a	GICHelper: Correctly assign return value ... to preserve reference counting logic. In practice the missing assignment would not have caused any issues. We still fix it as the code is wrong and it also causes noise in the clang static analysis runs. llvm-svn: 280946	2016-09-08 14:34:54 +00:00
Tobias Grosser	b27ed0da37	SCEVAffinator: Add missing __isl_take annotations llvm-svn: 280943	2016-09-08 14:31:31 +00:00
Tobias Grosser	55a7af7da5	ScopInfo: Make clear that no double-free problem exists When running the clang static analyser to check for memory issues, this code originally showed a double free, as the analyser was unable to understand that isl_set_free always returns NULL and consequently later uses of the isl object we just freed will never be reached. Without this knowledge, the analyser has to issue a warning. We refactor the code to make it clear that for empty maps the current loop iteration is aborted. llvm-svn: 280940	2016-09-08 14:08:07 +00:00
Tobias Grosser	b316dc166f	ScopDetection: Make sure we do not accidentally divide by zero This code path is likely never triggered, but by still handling this case locally we avoid warnings in clang's static analyzer. llvm-svn: 280939	2016-09-08 14:08:05 +00:00
Tobias Grosser	adfc971820	DependenceInfo: Make clear that no double-free problem exists When running the clang static analyser to check for memory issues, this code originally showed a double free, as the analyser was unable to understand that isl_union_map_free always returns NULL and consequently later uses of the isl object we just freed will never be reached. Without this knowledge, the analyser has to issue a warning. We refactor the code to make it clear that for empty maps the current loop iteration is aborted. llvm-svn: 280938	2016-09-08 14:08:01 +00:00
Tobias Grosser	f3600dfa2d	IslNodeBuilder: Add missing __isl_take annotations llvm-svn: 280936	2016-09-08 13:48:55 +00:00
Tobias Grosser	2a526feec9	ScopInfo: Add missing __isl_take annotation llvm-svn: 280923	2016-09-08 11:18:56 +00:00
Michael Kruse	349779cc99	Disable MSVC warnings on ISL. Disable some Visual C++ warnings on ISL. These are not reported by GCC/Clang in the ISL build system. We do not intend to fix them in the Polly in-tree copy, hence disable these warnings. llvm-svn: 280811	2016-09-07 14:11:20 +00:00
Michael Kruse	564579726a	Add check-polly-tests build target. The check-polly-tests target runs regression/unit tests but without checking formatting. This is useful to avoid having to reload a file in an open editor (which e.g. clears the undo buffer, moves cursor/window position) when running polly-update-format. After this change, the following test targets exist: - check-polly-unittests to run unittests only - check-polly-tests to run unit and regression tests - polly-check-format to check formatting using clang-format - check-polly to run them all As a side-effect, when running check-polly, polly-check-format and check-polly-tests run in parallel (instead of polly-check-format running first). Differential Revision: https://reviews.llvm.org/D24191 llvm-svn: 280654	2016-09-05 10:54:16 +00:00
Tobias Grosser	8d4cb1a060	ScopInfo: Do not derive assumptions from all GEP pointer instructions ... but instead rely on the assumptions that we derive for load/store instructions. Before we were able to delinearize arrays, we used GEP pointer instructions to derive information about the likely range of induction variables, which gave us more freedom during loop scheduling. Today, this is not needed any more as we delinearize multi-dimensional memory accesses and as part of this process also "assume" that all accesses to these arrays remain inbounds. The old derive-assumptions-from-GEP code has consequently become mostly redundant. We drop it both to clean up our code, but also to improve compile time. This change reduces the scop construction time for 3mm in no-asserts mode on my machine from 48 to 37 ms. llvm-svn: 280601	2016-09-03 21:55:25 +00:00
Tobias Grosser	66c6506aac	Dependences: Only create flat StmtSchedule in presence of reductions Without reductions we do not need a flat union_map schedule describing the computation we want to perform, but can work purely on the schedule tree. This reduces the dependence computation and scheduling time from 33ms to 25ms. Another 30% reduction. llvm-svn: 280558	2016-09-02 23:40:15 +00:00
Tobias Grosser	dff5de2e44	Dependences: Exit early, if no reduction dependences are needed. In case we do not compute reduction dependences or dependences that are more fine-grained than statement level dependences, we can avoid the corresponding part of the dependence analysis altogether. For the 3mm benchmark, this reduces scheduling + dependence analysis time from 62ms to 33ms for a no-asserts build. The majority of the compile time is anyhow spent in the LLVM backends, when doing code generation. Nevertheless, there is no need to waste compile time either. llvm-svn: 280557	2016-09-02 23:29:38 +00:00
Tobias Grosser	b1000c39a0	Introduce option to run isl AST generation, but no IR generation. We replace the options -polly-code-generator=none =isl with the options -polly-code-generation=none =ast =full This allows us to measure the overhead of Polly itself, versus the compile time increases due to us generating more IR and consequently the LLVM backends spending more time on this IR. We also use this opportunity to rename the option. The original name was introduced at a point where we still had two code generators: CLooG and the isl AST generator. Since we only have one AST generator left, there is no need to distinguish between 'isl' and something else. However, being able to disable code generation altogether has been shown useful for debugging. Hence, we rename and extend this option to make it a good fit for its new use case. llvm-svn: 280554	2016-09-02 23:05:42 +00:00
Tobias Grosser	c80d6979bd	Drop '@brief' from doxygen comments LLVM's coding guideline suggests to not use @brief for one-sentence doxygen comments to improve readability. Switch this once and for all to ensure people do not copy @brief comments from other parts of Polly, when writing new code. llvm-svn: 280468	2016-09-02 06:33:33 +00:00
Michael Kruse	2fa3519463	Allow mapping scalar MemoryAccesses to array elements. Change the code around setNewAccessRelation to allow using an existing array element for memory instead of an ad-hoc alloca. This facility will be used for DeLICM/DeGVN to convert scalar dependencies into regular ones. The changes necessary include: - Make the code generator use the implicit locations instead of the alloca ones. - A test case - Make the JScop importer accept changes of scalar accesses for that test case. - Adapt the MemoryAccess interface to the fact that the MemoryKind can change. They are named (get|is)OriginalXXX() to get the status of the memory access before any change by setNewAccessRelation() (some properties such as getIncoming() do not change even if the kind is changed and are still required). To get the modified properties, there is (get|is)LatestXXX(). The old accessors without Original|Latest become synonyms of the (get|is)OriginalXXX() to not make functional changes in unrelated code. Differential Revision: https://reviews.llvm.org/D23962 llvm-svn: 280408	2016-09-01 19:53:31 +00:00
Michael Kruse	772ce72000	Check validity of new access relations. NFC. There are some constraints on maps that can be access relations. In builds with assertions enabled, verify - The access domain is the same space as the statement's domain (modulo parameters). - Whether an access is defined for every instance of the statement. (codegen does not yet support partial access relations) - Whether the access range links to an array, represented by a ScopArrayInfo. - The number of access dimensions equals the dimensions of the array. - The array is not an indirect access. (also not supported by codegen) Differential Revision: https://reviews.llvm.org/D23916 llvm-svn: 280404	2016-09-01 19:16:58 +00:00
Michael Kruse	d56b90a967	[ScopInfo] Add missing ISL annotations NFC. llvm-svn: 280343	2016-09-01 09:03:27 +00:00
Michael Kruse	77564f92e8	Update ISL to isl-0.17.1-203-g3fef898. This version has isl_space_has_equal_tuples added to the public API. llvm-svn: 280341	2016-09-01 08:26:22 +00:00
Tobias Grosser	90a3c0ba99	Add forgotten image llvm-svn: 280083	2016-08-30 12:41:29 +00:00
Tobias Grosser	cb8f813254	www: homepage "Overview and News" llvm-svn: 280082	2016-08-30 12:41:08 +00:00
Tobias Grosser	0bb9c4b09a	www: shorten links in menu llvm-svn: 280081	2016-08-30 12:41:04 +00:00
Tobias Grosser	e1889f186d	www: link to github source mirror, drop the other old source viewers llvm-svn: 280080	2016-08-30 12:41:02 +00:00
Tobias Grosser	027d2f7bfd	www: improve formatting of external links llvm-svn: 280079	2016-08-30 12:40:59 +00:00
Tobias Grosser	e5721d659c	www: Add links to Polly Labs and Polyhedral.info llvm-svn: 280076	2016-08-30 12:08:25 +00:00
Tobias Grosser	80a9579db9	www: Add IMPACT 2017 announcement to news page llvm-svn: 280060	2016-08-30 06:35:17 +00:00
Michael Kruse	d262feff80	Add space between access string and follow-up. llvm-svn: 279826	2016-08-26 15:43:52 +00:00
Michael Kruse	6b6e38d9b1	Add "New access function" to update_check.py classifier. Lines with this prefix are printed by JSONImporter. llvm-svn: 279825	2016-08-26 15:43:43 +00:00
Tobias Grosser	9d61e4a980	Avoid the use of large unsigned values in isl unit test isl_val_int_from_ui takes an 'unsigned long', which is only 32 bits wide on 32-bit and LLP64 Windows systems. Hence, make sure we do not use it with constants that are larger than 32 bits. Reported-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 279824	2016-08-26 15:42:38 +00:00
Roman Gareev	44aeef7ecf	[FIX] Access dimensions should correspond to number of dimensions of the accessed array. llvm-svn: 279821	2016-08-26 13:41:53 +00:00
Tobias Grosser	aa6eb5cd80	unittests: Make the expected value the first argument in EXPECT_EQ [NFC] This improves the readability of failing test results, as gtest prints always the first argument as the 'expected value'. In the previous commit we already changed the tests for isl_valFromAPInt. In this commit, the tests for IslValToAPInt follow. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 279817	2016-08-26 12:25:08 +00:00
Tobias Grosser	437200089d	Improve documentation and testing for isl_valFromAPInt The recent unit tests we gained made clear that the semantics of isl_valFromAPInt are not clear, due to missing documentation. In this change we document both the calling interface as well as the implementation of isl_valFromAPInt. We also make the implementation easier to read by removing integer wrapping in abs() when passing in the minimal integer value for a given bitwidth. Even though wrapping and subsequently interpreting the result as unsigned value gives the correct result, this is far from obvious. Instead, we explicitly add one more bit to the input type to ensure that abs will never wrap. This change did not uncover a bug in the old implementation, but was introduced to increase readability. We update the tests to add a test case for this special case and use this opportunity to also test a number larger than 64 bit. Finally, we order the arguments of the test cases to make sure the expected output is first. This helps readability in case of failing test cases as gtest assumes the first value to be the expected value. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23917 llvm-svn: 279815	2016-08-26 12:01:07 +00:00
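The wrapping behaviour mentioned in 437200089d can be reproduced with plain integer arithmetic. The sketch below does not use LLVM's APInt or isl; the helper names `to_signed`, `wrapping_abs` and `widened_abs` are hypothetical and only model fixed-width two's-complement values.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    # If the sign bit is set, the value represents a negative number.
    return value - (1 << bits) if value >> (bits - 1) else value


def wrapping_abs(value, bits):
    """abs() computed at the original width; wraps for the minimal value."""
    return to_signed(abs(value), bits)


def widened_abs(value, bits):
    """abs() computed with one extra bit, so the minimal value cannot wrap."""
    return to_signed(abs(value), bits + 1)


int8_min = -128
wrapped = wrapping_abs(int8_min, 8)  # stays negative: abs wrapped around
as_unsigned = wrapped & 0xFF         # reinterpreting as unsigned recovers 128
widened = widened_abs(int8_min, 8)   # 128 directly, no reinterpretation needed
```

Both routes arrive at the same magnitude, which is consistent with the commit's statement that the change improves readability rather than fixing a bug.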
Tobias Grosser	76f8279e44	Improve documentation and testing of APIntFromVal The recent unit tests we gained made clear that the semantics of APIntFromVal are not clear, due to missing documentation. In this change we document both the calling interface as well as the implementation of APIntFromVal. We also make the implementation easier to read by removing the use of magic numbers. Finally, we add tests to check the bitwidth of the created values as well as the correct modeling of very large numbers. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23910 llvm-svn: 279813	2016-08-26 10:43:28 +00:00
Michael Kruse	b41b990e05	Query llvm-config to get system libs required for linking. Remove the unused function get_system_libs. Instead, run 'llvm-config --system-libs' to determine which libraries are required in addition to LLVM's for linking an executable. At the moment these are the unittests that link to gtest and transitively depend on these system libs. llvm-svn: 279743	2016-08-25 14:58:29 +00:00
Michael Kruse	606b06a156	Add comment for querying --libdir. NFC. llvm-svn: 279742	2016-08-25 14:43:04 +00:00
Michael Kruse	fd4a332a3a	Do not build unittests by default. Only build them for check-polly and check-polly-unittests in out-of-tree builds. In LLVM, this behaviour is controlled with LLVM_BUILD_TESTS. Polly out-of-tree does not have such a flag. llvm-svn: 279740	2016-08-25 14:33:44 +00:00
Michael Kruse	b6eadcc5cc	Add missing words to warning. llvm-svn: 279738	2016-08-25 13:29:26 +00:00
Michael Kruse	225d583825	Add warning for FORCE_STATIC libraries when using BUILD_SHARED_LIBS. We cannot build ISL as shared object because we build it with -fvisibility=hidden; The created shared object would have no accessible symbols. The reason it is built with -fvisibility=hidden is because opt/clang might load other libraries that have ISL embedded and whose symbols would conflict with Polly's private ISL. This could happen with DragonEgg. In the future we might instead statically link PollyISL into the Polly shared object to avoid installing the static library. Requested-by: Vedran Miletic <vedran@miletic.net> See-also: llvm.org/PR27306 llvm-svn: 279737	2016-08-25 13:21:53 +00:00
Michael Kruse	05cf9c22f1	Introduce unittests. Add the infrastructure for unittests to Polly and two simple tests for conversion between isl_val and APInt. In addition, a build target check-polly-unittests is added to run only the unittests but not the regression tests. Clang's unittest mechanism served as a blueprint which then was adapted to Polly. Differential Revision: https://reviews.llvm.org/D23833 llvm-svn: 279734	2016-08-25 12:36:15 +00:00
Michael Kruse	0e63ab4243	Use configure_lit_site_cfg instead of configure_file. configure_lit_site_cfg defines some more parameters that are used in lit.site.cfg.in. configure_file would leave those empty. These additional definitions seem to be unimportant for regression tests, but unittests do not work without them. In case of out-of-tree builds, define the additional parameters with default values. These may not take all configuration parameters into account, as configure_lit_site_cfg would. llvm-svn: 279733	2016-08-25 12:03:33 +00:00
Michael Kruse	17a8b791ae	Add LLVM libdir to library search path in out-of-tree builds. This previously was not required because in an out-of-tree build Polly would only build libraries (LLVMPolly, libPolly, libPollyISL, libPollyPPCG), but no executables where the libraries would be linked to. This will change when adding unittests in a follow-up commit. llvm-svn: 279730	2016-08-25 11:28:52 +00:00
Michael Kruse	941a692690	Also warn if llvm-lit is not available. The program 'llvm-lit', like 'not' and 'FileCheck', is necessary for running check-polly. Warn if any of the three is not in the LLVM_INSTALL_ROOT/bin directory. llvm-svn: 279728	2016-08-25 10:35:22 +00:00
Michael Kruse	4a080de057	Add %loadPolly to test command line. Required for out-of-tree builds of Polly. llvm-svn: 279657	2016-08-24 19:12:48 +00:00
Tim Shen	12921aaa7b	Migrate from NodeType * to NodeRef. llvm-svn: 279488	2016-08-22 22:30:27 +00:00
Roman Gareev	5f99f8656e	Add a flag to dump SCoP optimized with the IslScheduleOptimizer pass Dump polyhedral descriptions of Scops optimized with the isl scheduling optimizer and the set of post-scheduling transformations applied on the schedule tree to be able to check the work of the IslScheduleOptimizer pass at the polyhedral level. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23740 llvm-svn: 279395	2016-08-21 11:20:39 +00:00
Roman Gareev	e2ee79afde	Simplify AccFuncMap to vector<> AccessFunctions getAccessFunctions() is dead code and the 'BB' argument of getOrCreateAccessFunctions() is not used. This patch deletes getAccessFunctions and transforms AccFuncMap into a std::vector<std::unique_ptr<MemoryAccess>> AccessFunctions. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23759 llvm-svn: 279394	2016-08-21 11:09:19 +00:00
Eli Friedman	28671c83d6	[SCEVValidator] Don't reorder multiplies in extractConstantFactor. The existing code would add the operands in the wrong order, and eventually crash because the SCEV expression doesn't exactly match the parameter SCEV expression in SCEVAffinator::visit. (SCEV doesn't sort the operands to getMulExpr in general.) Differential Revision: https://reviews.llvm.org/D23592 llvm-svn: 279087	2016-08-18 16:30:42 +00:00
Tobias Grosser	1c18440958	[BlockGenerator] Invalidate SCEV values for instructions in scop We already invalidated a couple of critical values earlier on, but we now invalidate all instructions contained in a scop after the scop has been code generated. This is necessary as later scops may otherwise obtain SCEV expressions that reference values in the earlier scop that before dominated the later scop, but which had been moved into the conditional branch and consequently do not dominate the later scop any more. If these very values are then used during code generation of the later scop, we generate uses that are not dominated by the values they use. This fixes: http://llvm.org/PR28984 llvm-svn: 279047	2016-08-18 10:45:57 +00:00
Michael Kruse	ffb3278e27	Update ISL to isl-0.17.1-200-gd8de4ea. This version fixes a bug in set coalescing. llvm-svn: 278936	2016-08-17 15:24:45 +00:00
Tobias Grosser	b143e31164	[ScopInfo] Make scalars used by PHIs in non-affine regions available Normally this is ensured when adding PHI nodes, but as PHI node dependences do not need to be added in case all incoming blocks are within the same non-affine region, this was missed. This corrects an issue visible in LNT's sqlite3, in case invariant load hoisting was disabled. llvm-svn: 278792	2016-08-16 11:44:48 +00:00
Tobias Grosser	c80c15bd50	[ScopDetect] Do not assert in case of AddRecs with non-constant start expression llvm-svn: 278738	2016-08-15 20:59:30 +00:00
Tobias Grosser	74814e1a07	Disable invariant load hoisting temporarily With invariant load hoisting enabled the LLVM buildbots currently show some miscompiles, which are possibly caused by invariant load hoisting itself. Confirming and fixing this requires a more in-depth analysis. To meanwhile get back green buildbots that allow us to observe other regressions, we disable invariant code hoisting temporarily. The relevant bug is tracked at: http://llvm.org/PR28985 llvm-svn: 278681	2016-08-15 16:43:36 +00:00
Tobias Grosser	13e55a32fd	[test] Force invariant load hoisting one last time Without invariant load hoisting an (unrelated) bug is exposed in this test case: http://llvm.org/PR28984 llvm-svn: 278680	2016-08-15 16:43:33 +00:00
Tobias Grosser	7cb809983d	[tests] Force invariant load hoisting for test cases that need it -- III llvm-svn: 278673	2016-08-15 15:56:24 +00:00
Tobias Grosser	ad61c170d5	[tests] Force invariant load hoisting for test cases that need it II llvm-svn: 278669	2016-08-15 13:58:16 +00:00
Tobias Grosser	75b9c7df4d	[test] Correct spelling in test case and explicitly enable invariant load hoisting for this test case. llvm-svn: 278668	2016-08-15 13:58:04 +00:00
Tobias Grosser	6e6264c142	[tests] Force invariant load hoisting for test cases that need it This will make it easier to switch the default of Polly's invariant load hoisting strategy and also makes it very clear that these test cases indeed require invariant code hoisting to work. llvm-svn: 278667	2016-08-15 13:27:49 +00:00
Roman Gareev	1c892e91e3	Perform replacement of access relations and creation of new arrays according to the packing transformation This is the third patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform replacement of the access relations and create empty arrays, which are steps to implement the packing transformation. In subsequent changes we will implement copying to created arrays. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D22187 llvm-svn: 278666	2016-08-15 12:22:54 +00:00
Tobias Grosser	d58acf866a	[GPGPU] Ensure arrays where only parts are modified are copied to GPU To do so we change the way array extents are computed. Instead of the precise set of memory locations accessed, we now compute the extent as the range between minimal and maximal address in the first dimension and the full extent defined by the sizes of the inner array dimensions. We also move the computation of the may_persist region after the construction of the arrays, as it relies on array information. Without arrays being constructed no useful information is computed at all. llvm-svn: 278212	2016-08-10 10:58:19 +00:00
Mandeep Singh Grang	8c5314479e	Fix spacing around variable initializations and for-loops. NFC. Reviewers: grosser, _jdoerfert, zinob Projects: #polly Differential Revision: https://reviews.llvm.org/D23285 llvm-svn: 278143	2016-08-09 17:49:24 +00:00
Tobias Grosser	b06ff4574e	[GPGPU] Support PHI nodes used in GPU kernel Ensure the right scalar allocations are used as the host location of data transfers. For the device code, we clear the allocation cache before device code generation to be able to generate new device-specific allocation and we need to make sure to add back the old host allocations as soon as the device code generation is finished. llvm-svn: 278126	2016-08-09 15:35:06 +00:00
Tobias Grosser	750160e260	[GPGPU] Use separate basic block for GPU initialization code This increases the readability of the IR and also clarifies that the GPU initialization is executed _after_ the scalar initialization which needs to happen before the code of the transformed scop is executed. Besides increased readability, the IR should not change. Specifically, I do not expect any changes in program semantics due to this patch. llvm-svn: 278125	2016-08-09 15:35:03 +00:00
Tobias Grosser	776700d0b7	[BlockGenerator] Insert initializations at beginning of start block In case some code -- not guarded by control flow -- would be emitted directly in the start block, it may happen that this code would use uninitialized scalar values if the scalar initialization is only emitted at the end of the start block. This is not a problem today in normal Polly, as all statements are emitted in their own basic blocks, but Polly-ACC emits host-to-device copy statements into the start block. Additional Polly-ACC test coverage will be added in subsequent changes that improve the handling of PHI nodes in Polly-ACC. llvm-svn: 278124	2016-08-09 15:34:59 +00:00
Tobias Grosser	77f76788dc	[tests] Add two missing 'REQUIRES' lines llvm-svn: 278104	2016-08-09 09:11:39 +00:00
Tobias Grosser	c59b3ce044	[BlockGenerator] Also eliminate dead code not originating from BB After having generated the code for a ScopStmt, we run a simple dead-code elimination that drops all instructions that are known to be and remain unused. Until this change, we only considered instructions for dead-code elimination, if they have a corresponding instruction in the original BB that belongs to ScopStmt. However, when generating code we do not only copy code from the BB belonging to a ScopStmt, but also generate code for operands referenced from BB. After this change, we now also consider code for dead code elimination, which does not have a corresponding instruction in BB. This fixes a bug in Polly-ACC where such dead-code referenced CPU code from within a GPU kernel, which is possible as we do not guarantee that all variables that are used in known-dead-code are moved to the GPU. llvm-svn: 278103	2016-08-09 08:59:05 +00:00
Tobias Grosser	cf66ef26f3	[GPGPU] Pass parameters always by using their own type llvm-svn: 278100	2016-08-09 07:22:08 +00:00
Michael Kruse	a6cc0d3a2d	[ScopDetection] Remove unused DetectionContexts during expansion. The function expandRegion() frees Region* objects again when it determines that these are not valid SCoPs. However, the DetectionContext added to the DetectionContextMap still holds a reference. The validity is checked using the ValidRegions lookup table. When a new Region is added to that list, it might share the same address, such that the DetectionContext contains two Region* associations that are in ValidRegions, but that are unrelated and of which one has already been freed. Also remove the DetectionContext when not a valid expansion. llvm-svn: 278062	2016-08-08 22:39:32 +00:00
Tobias Grosser	124534038a	[GPGPU] Support Values referenced from both isl expr and llvm instructions When adding code that avoids passing values used in isl expressions and LLVM instructions twice, we forgot to make the single variable passed to the kernel available in the ValueMap that makes it usable for instructions that are not replaced with isl ast expressions. This change adds the variable that is passed to the kernel to the ValueMap to ensure it is available for such use cases as well. llvm-svn: 278039	2016-08-08 19:22:19 +00:00
Tobias Grosser	cb1aef8de4	[GPGPU] Create code to verify run-time conditions llvm-svn: 278026	2016-08-08 17:35:55 +00:00
Tobias Grosser	fa9abd1f03	Fix compilation in 'asserts' mode llvm-svn: 278025	2016-08-08 17:35:52 +00:00
Tobias Grosser	0aa29532b7	[IslNodeBuilder] Move run-time check generation to NodeBuilder [NFC] This improves the structure of the code and allows us to reuse the runtime code generation in the PPCGCodeGeneration. llvm-svn: 278017	2016-08-08 15:41:52 +00:00
Tobias Grosser	219feac456	[CodeGeneration] Do not set insert position redundantly There is no need to reset the position of the builder, as we can just continue to insert code at the current position of the IRBuilder, which happens to be precisely the location we reset the builder to. llvm-svn: 278014	2016-08-08 15:25:50 +00:00
Tobias Grosser	000db70754	[IslNodeBuilder] Directly use the insert location of our Builder ... instead of adding instructions at the end of the basic block the builder is currently at. This makes it easier to reason about where IR is generated, as with the IRBuilder there is just a single location that specifies where IR is generated. llvm-svn: 278013	2016-08-08 15:25:46 +00:00
Michael Kruse	fbde435517	[CodeGen] Use MapVector instead of DenseMap. The map is iterated over when generating the values escaping the SCoP. The nondeterministic iteration order of DenseMap causes the output IR to change at every compilation, adding noise to comparisons. Replace DenseMap by a MapVector to ensure the same iteration order at every compilation. llvm-svn: 277832	2016-08-05 16:45:51 +00:00
Michael Kruse	d82222fc1b	[DependenceInfo] Reset operations counter when setting limit. When entering the dependence computation and the max_operations is set, the operations counter may have already exceeded the limit, thus aborting any ISL computation from the start. The counter is reset at the end of the dependence calculation such that a follow-up recomputation might succeed, i.e. the success of the first dependence calculation depends on unrelated ISL operations that happened before, putting it at a disadvantage compared to the following calculations. This patch resets the operations counter at the beginning of the dependence recalculation to not depend on previous actions. Otherwise additional preprocessing of the Scop that aims to improve its schedulability (e.g. DeLICM) has the effect that DependenceInfo and hence the scheduling are more likely to fail, which is counterproductive to the goal of said preprocessing. llvm-svn: 277810	2016-08-05 11:31:02 +00:00
Tobias Grosser	928d7573dd	GPGPU: Sort dimension sizes of multi-dimensional shared memory arrays correctly Before this commit we generated the array type in reverse order and we also added the outermost dimension size to the new array declaration, which is incorrect as Polly additionally assumed an unsized outermost dimension, such that we had an off-by-one error in the linearization of access expressions. llvm-svn: 277802	2016-08-05 08:27:24 +00:00
Tobias Grosser	470608e3e4	Add missing 'REQUIRES' line llvm-svn: 277800	2016-08-05 07:08:45 +00:00
Tobias Grosser	c1c6a2a61b	GPGPU: Add cuda annotations to specify maximal number of threads per block These annotations ensure that the NVIDIA PTX assembler limits the number of registers used such that we can be certain the resulting kernel can be executed for the number of threads in a thread block that we are planning to use. llvm-svn: 277799	2016-08-05 06:47:43 +00:00
Tobias Grosser	f919d8b360	GPGPU: Support scalars that are mapped to shared memory llvm-svn: 277726	2016-08-04 13:57:29 +00:00
Tobias Grosser	8950cead7f	GPGPU: Disable verbose debug output llvm-svn: 277724	2016-08-04 12:44:03 +00:00
Tobias Grosser	b0dd95bcd2	Remove leftover debug output llvm-svn: 277723	2016-08-04 12:41:28 +00:00
Tobias Grosser	130ca30f92	GPGPU: Add private memory support llvm-svn: 277722	2016-08-04 12:39:03 +00:00
Tobias Grosser	b513b4916b	GPGPU: Add support for shared memory llvm-svn: 277721	2016-08-04 12:18:14 +00:00
Tobias Grosser	b187515784	GPGPU: Cache PTX kernels We always keep a number of already compiled kernels available to avoid costly recompilation. llvm-svn: 277707	2016-08-04 09:15:58 +00:00
Tobias Grosser	00bb5a99f5	GPGPU: Handle scalar array references Pass the content of scalar array references to the alloca on the kernel side and do not pass them additionally as normal LLVM scalar values. llvm-svn: 277699	2016-08-04 06:55:59 +00:00
Tobias Grosser	3216f8546c	BlockGenerator: Assert that we do not get alloca of array access llvm-svn: 277698	2016-08-04 06:55:53 +00:00
Tobias Grosser	576932728d	GPGPU: Pass subtree values correctly to the kernel llvm-svn: 277697	2016-08-04 06:55:49 +00:00
Tobias Grosser	629109b633	GPGPU: Mark kernel functions as polly.skip Otherwise, we would try to re-optimize them with Polly-ACC and possibly even generate kernels that try to offload themselves, which does not work as the GPURuntime is not available on the accelerator and also does not make any sense. llvm-svn: 277589	2016-08-03 12:00:07 +00:00
Tobias Grosser	2219d15748	Fix a couple of spelling mistakes llvm-svn: 277569	2016-08-03 05:28:09 +00:00
Roman Gareev	0c09a3af00	Add missing prefixes. llvm-svn: 277264	2016-07-30 11:15:00 +00:00
Roman Gareev	d7754a1245	Extend the jscop interface to allow the user to declare new arrays and to reference these arrays from access expressions Extend the jscop interface to allow the user to export arrays. It is required that already existing arrays of the list of arrays correspond to arrays of the SCoP. Each array that is appended to the list will be newly created. Furthermore, we allow the user to modify access expressions to reference any array in case it has the same element type. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D22828 llvm-svn: 277263	2016-07-30 09:25:51 +00:00
Tobias Grosser	8af38ecaa3	Add missing REQUIRES line llvm-svn: 276964	2016-07-28 07:08:34 +00:00
Tobias Grosser	d8b94bcac1	GPGPU: Pass context parameters to GPU kernel llvm-svn: 276963	2016-07-28 06:47:59 +00:00
Tobias Grosser	a490147c90	GPGPU: Pass host iterators to kernel llvm-svn: 276962	2016-07-28 06:47:56 +00:00
Tobias Grosser	44143bb927	GPGPU: use current 'Index' to find slot in parameter array Before this change we used the array index, which would result in us accessing the parameter array out-of-bounds. This bug was visible for test cases where not all arrays in a scop are passed to a given kernel. llvm-svn: 276961	2016-07-28 06:47:53 +00:00
Tobias Grosser	4e18d71c71	GPGPU: Generate kernel parameter allocation with right size Before this change we miscounted the number of function parameters. llvm-svn: 276960	2016-07-28 06:47:50 +00:00
Tobias Grosser	79a947c233	GPGPU: Add basic support for kernel launches llvm-svn: 276863	2016-07-27 13:20:16 +00:00
Tobias Grosser	5779359624	GPGPU: Load GPU kernels We embed the PTX code into the host IR as a global variable and compile it at run-time into a GPU kernel. llvm-svn: 276645	2016-07-25 16:31:21 +00:00
Johannes Doerfert	8031238017	[GSoC] Add PolyhedralInfo pass - new interface to polly analysis Adding a new pass PolyhedralInfo. This pass will be the interface to Polly. Initially, we will provide the following interface: - #IsParallel(Loop *L) - return a bool depending on whether the loop is parallel or not for the given program order. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D21486 llvm-svn: 276637	2016-07-25 12:48:45 +00:00
Tobias Grosser	13c78e4d51	GPGPU: Emit data-transfer code Also factor out getArraySize() to avoid code duplication and reorder some function arguments to indicate the direction into which data is transferred. llvm-svn: 276636	2016-07-25 12:47:39 +00:00
Tobias Grosser	7287aeddf1	GPGPU: Complete code to allocate and free device arrays At the beginning of each SCoP, we allocate device arrays for all arrays used on the GPU and we free such arrays after the SCoP has been executed. llvm-svn: 276635	2016-07-25 12:47:33 +00:00
Tobias Grosser	19b8a0bbfb	GPURuntime: Add missing debug output llvm-svn: 276634	2016-07-25 12:47:28 +00:00
Tobias Grosser	9855e8bd80	GPURuntime: Fix typo in docu llvm-svn: 276633	2016-07-25 12:47:25 +00:00
Tobias Grosser	a71eedd4c5	GPURuntime: Drop polly_cleanupGPGPUResources This function is currently unused and won't be used in this form again. Instead of freeing many unrelated items at the same time, we will explicitly call a free function from the host-IR we generate for each object we want to free. These specific free functions will be added together with the corresponding host-IR generation code. llvm-svn: 276632	2016-07-25 12:47:22 +00:00
Johannes Doerfert	3b7ac0a691	[GSoC] Do not process SCoPs with infeasible runtime context Do not process SCoPs with infeasible runtime context in the new ScopInfoWrapperPass. Do not compute dependences for such SCoPs in the new DependenceInfoWrapperPass. Patch by Utpal Bora <cs14mtech11017@iith.ac.in> Differential Revision: https://reviews.llvm.org/D22402 llvm-svn: 276631	2016-07-25 12:40:59 +00:00
Roman Gareev	3a18a931a8	Apply all necessary tilings and interchangings to get a macro-kernel This is the second patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we create the BLIS macro-kernel by applying a combination of tiling and interchanging. In subsequent changes we will implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: http://reviews.llvm.org/D21491 llvm-svn: 276627	2016-07-25 09:42:53 +00:00
Tobias Grosser	fa7b080218	GPGPU: initialize GPU context and simplify the corresponding GPURuntime interface. There is no need to expose the selected device at the moment. We also pass back pointers as return values, as this simplifies the interface. llvm-svn: 276623	2016-07-25 09:16:01 +00:00

2903 Commits