llvm-project

Commit Graph

Author	SHA1	Message	Date
Yuanfang Chen	bb51d24330	Revert "Revert "Reland "[Support] make report_fatal_error `abort` instead of `exit`""" This reverts commit `80a34ae311` with fixes. On bots llvm-clang-x86_64-expensive-checks-ubuntu and llvm-clang-x86_64-expensive-checks-debian only, llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little bit in the hope that LIT is the culprit since this change is not in the codepath these tests are testing. llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll	2020-02-13 10:02:53 -08:00
Michael Halkenhäuser	1e0be76e98	[Polly] LLVM OpenMP Backend -- Fix "static chunked" scheduling. Static chunked OpenMP scheduling has not been treated correctly. This patch fixes the problem that threads would not process their (work-)chunks as intended. Differential Revision: https://reviews.llvm.org/D61081	2020-02-11 12:51:35 -06:00
Michael Kruse	e8227804ac	[Polly] Update ISL to isl-0.22.1-87-gfee05a13. The primary motivation is to fix an assertion failure in isl_basic_map_alloc_equality: isl_assert(ctx, room_for_con(bmap, 1), return -1); Although the assertion does not occur anymore, I could not identify which of ISL's commits fixed it. Compared to the previous ISL version, Polly requires some changes for this update * Since ISL commit 20d3574 "perform parameter alignment by modifying both arguments to function" isl__gist_ and similar functions do not always align the paramter list anymore. This caused the parameter lists in JScop files to become out-of-sync. Since many regression tests use JScop files with a fixed parameter list and order, we explicitly call align_params to ensure a predictable parameter list. * ISL changed some return types to isl_size, a typedef of (signed) int. This caused some issues where the return type was unsigned int before: - No overload for std::max(unsigned,isl_size) - It cause additional 'mixed signed/unsigned comparison' warnings. Since they do not break compilation, and sizes larger than 2^31 were never supported, I am going to fix it separately. * With the change to isl_size, commit 57d547 "isl__list_size: return isl_size" also changed the return value in case of an error from 0 to -1. This caused undefined looping over isl_iterator since the 'end iterator' got index -1, never reached from the 'begin iterator' with index 0. Some internal changes in ISL caused the number of operations to increase when determining access ranges to determine aliasing overlaps. In one test, this caused exceeding the default limit of 800000. The operations-limit was disabled for this test.	2020-02-10 19:03:08 -06:00
Eli Friedman	2f6b9edfa8	[AliasAnalysis] Add missing FMRB_* enums. Previously, the enums didn't account for all the possible cases, which could cause misleading results (particularly for a "switch" on FunctionModRefBehavior). Fixes regression in polly from recent patch to add writeonly to memset. While I'm here, also fix a few dubious uses of the FMRB_* enum values. Differential Revision: https://reviews.llvm.org/D73154	2020-01-28 15:47:08 -08:00
Eli Friedman	d9e6196312	[polly] XFAIL memset_null.ll. I'm working on a patch, but not sure how long it'll take.	2020-01-21 17:29:44 -08:00
serge_sans_paille	24ab9b537e	Generalize the pass registration mechanism used by Polly to any third-party tool There's quite a lot of references to Polly in the LLVM CMake codebase. However the registration pattern used by Polly could be useful to other external projects: thanks to that mechanism it would be possible to develop LLVM extension without touching the LLVM code base. This patch has two effects: 1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it with a generic mechanism 2. Provide a generic mechanism to register compiler extensions. A compiler extension is similar to a pass plugin, with the notable difference that the compiler extension can be configured to be built dynamically (like plugins) or statically (like regular passes). As a result, people willing to add extra passes to clang/opt can do it using a separate code repo, but still have their pass be linked in clang/opt as built-in passes. Differential Revision: https://reviews.llvm.org/D61446	2020-01-02 16:45:31 +01:00
Fangrui Song	a36ddf0aa9	Migrate function attribute "no-frame-pointer-elim"="false" to "frame-pointer"="none" as cleanups after D56351	2019-12-24 16:27:51 -08:00
Fangrui Song	502a77f125	Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" as cleanups after D56351	2019-12-24 15:57:33 -08:00
Michael Kruse	7be6ec5fa2	[GPGPU] Fix regression test after 395124. Commit 395124 "NVPTX: Don't insert an extra empty line at the end of the last section" changed the length of the kernel payload. Update the regression test to the new binary size.	2019-11-13 06:20:17 +00:00
Michael Kruse	d72637f5cc	[ScopBuilder] Fix bug 38358 by preserving correct order of ScopStmts. ScopBuilder::buildEqivClassBlockStmts creates ScopStmts for instruction groups in basic block and inserts these ScopStmts into Scop::StmtMap, however, as described in llvm.org/PR38358, comment #5, StmtScops are inserted into vector ScopStmt[BB] in wrong order. As a result, ScopBuilder::buildSchedule creates wrong order sequence node. Looking closer to code, it's clear there is no equivalent classes with interleaving isOrderedInstruction(memory access) instructions after joinOrderedInstructions. Afterwards, ScopStmts need to be created and inserted in the original order of memory access instructions, however, at the moment ScopStmts are inserted in the order of leader instructions which are probably not memory access instructions. The fix is simple with a standalone loop scanning isOrderedInstruction(memory access) instructions in basic block and inserting elements into LeaderToInstList one by one. The patch also removes double reversing operations which are now unnecessary. New test preserve-equiv-class-order-in-basic_block.ll is also added. Differential Revision: https://reviews.llvm.org/D68941 llvm-svn: 375192	2019-10-17 23:55:35 +00:00
Tim Northover	1249126c7c	Revert "Update polly test for SCEV change." The motivating SCEV change was reverted as incorrect. llvm-svn: 373185	2019-09-30 07:47:08 +00:00
Michael Kruse	241b02e762	[CodeGen] Handle outlining of CopyStmts. Since the removal of extensions nodes from schedule trees in r362257 it is possible to emit parallel code for SCoPs containing matrix-multiplications. However, the code looking for references used in outlined statement was not prepared to handle CopyStmts introduced by the matrix-matrix multiplication detection. In this case, CopyStmts do not introduce references in addition to the ones captured by MemoryAccesses, i.e. we change the assertion to accept CopyStmts and add a regression test for this case. This fixes llvm.org/PR43164 llvm-svn: 372188	2019-09-17 22:59:43 +00:00
Michael Kruse	87baae85cd	[ScopBuilder] Skip getting leader when merging statements to close holes. Function joinOrderedInstructions merges instructions when a leader is encountered twice. It also notices that leaders in SeenLeaders may lose their leadership in previous merging, and tries to handle the case using following code: Instruction *PrevLeader = UnionFind.getLeaderValue(SeenLeaders.back()); However, this is wrong because it always gets leader for the last element of SeenLeaders, and I believe it's wrong even we get leader for Prev here. As a result, Statements in cases like the one in patch aren't merged as expected. After investigation, I believe it's unnecessary to get leader instruction at all. This is based on fact: Although leaders in SeenLeaders could lose leadership, they only lose to others in SeenLeaders, in other words, one existing leader will be chosen as new leader of merged equivalent statements. We can take advantage of this and simply check if current leader equals to Prev and break merging if it does. The patch also adds a new test. Patch by bin.narwal <bin.narwal@gmail.com> Differential Revision: https://reviews.llvm.org/D67007 llvm-svn: 371801	2019-09-13 01:04:38 +00:00
Michael Kruse	cc0f0582c8	[Polly-ACC] Fix test after IR-printer change. After r367755, even unnamed parameters are printed in IR dumps. Change the test to expect te additional %0 in the line. llvm-svn: 368763	2019-08-13 22:42:08 +00:00
Eli Friedman	c68dd359ae	Update polly test for SCEV change. r366419 adds nsw to more SCEV expressions, which allows polly to make more aggressive assumptions about the input expressions. llvm-svn: 366510	2019-07-18 22:35:45 +00:00
Michael Kruse	88afd75300	[test] Add wrap flags after D61934. https://reviews.llvm.org/D61934, committed as r362687, r363540, r363364 and r363147, made some emitted instruction nus/nsw. Add these falgs to Polly's regression tests. This should fix Polly :: Isl/CodeGen/partial_write_in_region_with_loop.ll Polly :: Isl/CodeGen/scev_expansion_in_nonaffine.ll llvm-svn: 363599	2019-06-17 19:17:07 +00:00
Michael Kruse	aa8a976174	[ScheduleOptimizer] Hoist extension nodes after schedule optimization. Extension nodes make schedule trees are less flexible: Many operations, such as rescheduling, do not work on such schedule trees with extension. As such, some functionality such as determining parallel loops in isl's AST are disabled. Currently, only the pattern-matching generalized matrix-matrix multiplication optimization adds extension nodes (to add copy-in statements). This patch removes all extension nodes as the last step of the schedule optimization by hoisting the extension node's added domain up to the root domain node. All following passes can assume that schedule trees work without restrictions, including the parallelism test. Mark the outermost loop of the optimized matrix-matrix multiplication as parallel such that -polly-parallel is able to parallelize that loop. Differential Revision: https://reviews.llvm.org/D58202 llvm-svn: 362257	2019-05-31 19:26:57 +00:00
Michael Kruse	467069688d	[DeLICM] Use polly::singleton to allow empty result. isl_map_from_union_map cannot determine the map's space if the union_map is empty. polly::singleton was designed for this case. We pass the expected map space to avoid crashing in isl_map_from_union_map. This fixes an issue found by the aosp buildbot. Thanks to Eli Friedman for the reproducer. llvm-svn: 361290	2019-05-21 19:18:26 +00:00
Michael Kruse	c4c679c232	[CodeGen] Fix order of PHINode and MA Write generation. At the end of a region statement, the PHINode must be generated while the current IRBuilder's block is the region's exit node. For obvious reasons: The PHINode references the region's exiting block. A partial write would insert new control flow, i.e. insert new basic blocks between the exiting blocks and the current block. We fix this by generating the PHI nodes (region exit values) before generating any MemoryAccess's stores. This should fix the AOSP buildbot. Reported-by: Eli Friedman <efriedma@quicinc.com> llvm-svn: 361204	2019-05-20 22:31:09 +00:00
Eli Friedman	9b234b388d	[Polly] Don't crash on invalid delinearization result. In certain cases, it's possible for delinearization to decide one of the array dimensions should be some function of an induction variable inside the scop. Make sure if this happens, we refuse to use those dimensions for delinearization. Usually, we end up rejecting the scop before it actually crashes, but it looks like it's possible to slip past other checks in certain cases involving smax expressions. Fixes a crash that started showing up this week on the polly AOSP builder. As far as I can tell, this is a longstanding issue, though; it was just exposed by better SCEV analysis of smin expressions. Differential Revision: https://reviews.llvm.org/D61807 llvm-svn: 360708	2019-05-14 21:32:54 +00:00
Michael Kruse	2698390c68	[ZoneAlgo] Fix PHI inconsistency in invalid contexts. PHI nodes (reads) could point to multiple instances of predecessor blocks (PHI writes) when in an invalid context. Fix by removing PHI instances that are in an invalid or ouside assumed context. This fixes llvm.org/PR41656. llvm-svn: 360454	2019-05-10 18:38:13 +00:00
Keno Fischer	aa1b6f1cfb	[polly][SCEV] Expand SCEV matcher cases for new smin/umin ops These were added in rL360159, but I neglected to update polly at the same time. llvm-svn: 360238	2019-05-08 10:36:04 +00:00
Michael Kruse	89251edefc	[CodeGen] LLVM OpenMP Backend. The ParallelLoopGenerator class is changed such that GNU OpenMP specific code was removed, allowing to use it as super class in a template-pattern. Therefore, the code has been reorganized and one may not use the ParallelLoopGenerator directly anymore, instead specific implementations have to be provided. These implementations contain the library-specific code. As such, the "GOMP" (code completely taken from the existing backend) and "KMP" variant were created. For "check-polly" all tests that involved "GOMP": equivalents were added that test the new functionalities, like static scheduling and different chunk sizes. "docs/UsingPollyWithClang.rst" shows how the alternative backend may be used. Patch by Michael Halkenhäuser <michaelhalk@web.de> Differential Revision: https://reviews.llvm.org/D59100 llvm-svn: 356434	2019-03-19 03:18:21 +00:00
James Y Knight	693d39dd12	Remove irrelevant references to legacy git repositories from compiler identification lines in test-cases. (Doing so only because it's then easier to search for references which are actually important and need fixing.) llvm-svn: 351200	2019-01-15 16:18:52 +00:00
Zachary Turner	4b6f60073a	Fix another error related to YAML quoting. This one occured in polly, which I didn't build / test the first time so I didn't catch it. llvm-svn: 344378	2018-10-12 17:28:39 +00:00
Michael Kruse	7860c5fe4e	[IslAst] Fix InParallelFor nesting. IslAst could mark two nested outer loops as "OutermostParallel". It caused that the code generator tried to OpenMP-parallelize both loops, which it is not prepared loop. It was because the recursive AST build algorithm managed a flag "InParallelFor" to ensure that no nested loop is also marked as "OutermostParallel". Unfortunatetly the same flag was used by nodes marked as SIMD, and reset to false after the SIMD node. Since loops can be marked as SIMD inside "OutermostParallel" loops, the recursive algorithm again tried to mark loops as "OutermostParellel" although still nested inside another "OutermostParallel" loop. The fix exposed another bug: The function "astScheduleDimIsParallel" was only called when a loop was potentially "OutermostParallel" or "InnermostParallel", but as a side-effect also determines the minimum dependence distance. Hence, changing when we need to know whether a loop is "OutermostParallel" also changed which loop was annotated with "#pragma minimal dependence distance". Moreover, some complex condition linked with "InParallelFor" determined whether a loop should be an "InnermostParallel" loop. It missed some situations where it would not use mark as such although being inside an SIMD mark node, and therefore not be annotated using "#pragma simd". The changes in particular: 1. Split the "InParallelFor" flag into an "InParallelFor" and an "InSIMD" flag. 2. Unconditionally call "astScheduleDimIsParallel" for its side-effects and store the result in "InParallel" for later use. 3. Simplify the condition when a loop is "InnermostParallel". Fixes llvm.org/PR33153 and llvm.org/PR38073. llvm-svn: 343212	2018-09-27 13:39:37 +00:00
Tobias Grosser	4beb2f964b	[PerfMonitor] Fix rdtscp callsites Summary: Update all rdtscp callsites in PerfMonitor so that they conform with the signature changes introduced in r341698. Reviewers: grosser, bollu Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D51928 llvm-svn: 341946	2018-09-11 14:17:44 +00:00
Michael Kruse	842bdd0071	[ScopBuilder] Set domain to empty instead of NULL. The domain generation used nullptr to mark the domain of an error block as never-executed. Later, nullptr domains are recreated with a zero-tuple domain that then mismatches with the expected domain the error block within the loop. Instead of using nullptr, assign an empty domain which preserves the expected space. Remove empty domains during SCoP simplification. Fixes llvm.org/PR38218. llvm-svn: 338646	2018-08-01 22:28:32 +00:00
Michael Kruse	60e2a321e5	[test] Remove non-JSPON comments in JSCOP file. NFC. llvm-svn: 338185	2018-07-28 01:11:45 +00:00
Philip Pfaffe	cb8a82929c	Fix for r336080: Missing colon in REQUIRES line llvm-svn: 336083	2018-07-02 08:36:49 +00:00
Philip Pfaffe	d71493cb06	[polly-acc] change cl_get_* return types to 32/64bit Summary: This patch changes the return types for ocl_get_* functions during SPIR code generation. Because these functions return size_t types, the return type needs to be changed to the actual size of size_t on the device. Based on work by Michal Babej and Pekka Jääskeläinen Patch by: Alain Denzler Reviewers: grosser, philip.pfaffe, bollu Reviewed By: grosser, philip.pfaffe Subscribers: nemanjai, kbarton, llvm-commits Differential Revision: https://reviews.llvm.org/D48774 llvm-svn: 336080	2018-07-02 07:40:47 +00:00
Philip Pfaffe	ec1a3048a3	[ScopHelper] Provide support for recognising collective invariant loads Summary: This patch aims to provide support for detecting load patterns which are collectively invariant but right now `isHoistableLoad()` is checking each load instruction individually which cannot detect the load pattern as a whole. Patch by: Sahil Girish Yerawar Reviewers: bollu, philip.pfaffe, Meinersbur Reviewed By: philip.pfaffe, Meinersbur Differential Revision: https://reviews.llvm.org/D48026 llvm-svn: 335949	2018-06-29 07:29:45 +00:00
Tobias Grosser	17a098dedf	test: use regex matchers to make test-case robust against register renumberings Suggested-by: Michael Kruse llvm-svn: 335813	2018-06-28 07:11:48 +00:00
Tim Shen	63f244c4f4	[SCEV] Re-apply r335197 (with Polly fixes). Summary: This initiates a discussion on changing Polly accordingly while re-applying r335197 (D48338). I have never worked on Polly. The proposed change to param_div_div_div_2.ll is not educated, but just patterns that match the output. All LLVM files are already reviewed in D48338. Reviewers: jdoerfert, bollu, efriedma Subscribers: jlebar, sanjoy, hiraditya, llvm-commits, bixia Differential Revision: https://reviews.llvm.org/D48453 llvm-svn: 335292	2018-06-21 21:29:54 +00:00
Tobias Grosser	a78a809afc	Adjust to recent LLVM changes to fix buildbots llvm-svn: 334893	2018-06-16 17:38:19 +00:00
Tobias Grosser	ee5762cfab	[test] Fix a typo in a test case [NFCI] Also remove an undef value that does not add any value to the test case. llvm-svn: 334661	2018-06-13 21:46:29 +00:00
Krzysztof Parzyszek	fb3ed4f409	[Polly] Fix a testcase after LLVM commit r334318 ScalarEvolution has become slightly more intelligent, so obfuscate the exit condition in the testcase some more to keep it working. llvm-svn: 334327	2018-06-08 21:39:55 +00:00
Philip Pfaffe	7cc4300dac	[Acc] Followup for r333105: Fix one additional testcase llvm-svn: 333168	2018-05-24 10:18:09 +00:00
Philip Pfaffe	356d60683b	[Acc] Enable legacy stmt granularity in remaining failing testcases The default statement granularity changed in a recent change by Micheal. To avoid forwad-porting the testcases, enable the legacy behaviour again in these tests. llvm-svn: 333105	2018-05-23 17:46:10 +00:00
Philip Pfaffe	2e171b52ee	[Acc] Update testcases for minor changes in the PPCG mapper and statement naming - A recent ppcg/isl update caused the grid/block size upper bounds to deviate by one from the oracle. This is not an effect that's visible at runtime. - Statement naming changed in polly. Update the testcases. llvm-svn: 333090	2018-05-23 14:56:57 +00:00
Michael Kruse	d6c2ca8dd2	[DeLICM] Avoid assertion on out-of-quota. An assertion was not prepared to be passed a nullptr because the out-of-quota limit was exceeded. Bail-out before the assertion since the assertion does not apply on out-of-quote. This fixes llvm.org/PR37477. llvm-svn: 332488	2018-05-16 16:39:51 +00:00
Eli Friedman	9ae56b9a0e	[SCEVAffinator] Fix handling of pwaff complexity limit. nullptr is not a valid affine expression, and none of the callers check for null, so we eventually hit an isl error and crash. Instead, invalidate the scop and return a constant zero. Differential Revision: https://reviews.llvm.org/D46445 llvm-svn: 332309	2018-05-14 23:05:43 +00:00
Tobias Grosser	6bbca36414	Adjust to debug info metadata format change. Rename variable to retainedNodes. This unbreaks the Polly builds. llvm-svn: 331960	2018-05-10 07:09:10 +00:00
Michael Kruse	e330071b43	[ScopInfo] Remove bail out condition in buildMinMaxAccess(). The condition was introduced in r267142 to mitigate a long compile-time case. In r306087, a max-computation limit was introduced that should handle the same case while leaving the max disjuncts heuristic it should have replaced intact. Today, the max disjuncts bail-out causes problems in that it prematurely stops SCoPs from being detected, e.g. in SPEC's lbm. This would hit less like if isl_set_coalesce would be called after isl_set_remove_divs (which makes more basic_set likely to be coalescable) instead of before. This patch tries to remove the premature max-disjuncts bail-out condition by using simple_hull() to reduce the computational overhead, instead of directly invalidating that SCoP. Differential Revision: https://reviews.llvm.org/D45066 Contributed-by: Sahil Girish Yerawar <cs15btech11044@iith.ac.in> llvm-svn: 331891	2018-05-09 16:23:56 +00:00
Tobias Grosser	1c88d41020	[test] Replace undef with true/false to make test case less fragile This test case does not require undef to be present in branch conditions. Replace these undef values with true/false values to clarify the control-flow required to reach the loop under testing. llvm-svn: 331744	2018-05-08 07:24:05 +00:00
Philip Pfaffe	f1fadea5ce	Pass compiler arguments in the create_ll.sh script Summary: Occasionally you need an include or similar things to be configured when making a new testcase. Allow passing these to the script and down to the compiler calls. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: bollu, llvm-commits, pollydev Differential Revision: https://reviews.llvm.org/D46359 llvm-svn: 331364	2018-05-02 15:27:32 +00:00
Michael Kruse	e819fffee3	[CodeGen] Print executed statement instances at runtime. Add the options -polly-codegen-trace-stmts and -polly-codegen-trace-scalars. When enabled, adds a call to the beginning of every generated statement that prints the executed statement instance. With -polly-codegen-trace-scalars, it also prints the value of all scalars that are used in the statement, and PHIs defined in the beginning of the statement. Differential Revision: https://reviews.llvm.org/D45743 llvm-svn: 330864	2018-04-25 19:43:49 +00:00
Michael Kruse	beffdb9daa	[ScopDetect] Reject loop with multiple exit blocks. The current statement domain derivation algorithm does not (always) consider that different exit blocks of a loop can have different conditions to be reached. From the code for (int i = n; ; i-=2) { if (i <= 0) goto even; if (i <= 1) goto odd; A[i] = i; } even: A[0] = 42; return; odd: A[1] = 21; return; Polly currently derives the following domains: Stmt_even_critedge Domain := [n] -> { Stmt_even_critedge[] }; Stmt_odd Domain := [n] -> { Stmt_odd[] : (1 + n) mod 2 = 0 and n > 0 }; while the domain for the odd case is correct, Stmt_even is assumed to be executed unconditionally, which is obviously wrong. While projecting out the loop dimension in `adjustDomainDimensions`, it does not consider that there are other exit condition that have matched before. I don't know a how to fix this without changing a lot of code. Therefore This patch rejects loops with multiple exist blocks to fix the miscompile of test-suite's uuencode. The odd condition is transformed by LLVM to %cmp1 = icmp eq i64 %indvars.iv, 1 such that the project_out in adjustDomainDimensions() indeed only matches for odd n (using this condition only, we'd have an infinite loop otherwise). The even condition manifests as %cmp = icmp slt i64 %indvars.iv, 3 Because buildDomainsWithBranchConstraints() does not consider other exit conditions, it has to assume that the induction variable will eventually be lower than 3 and taking this exit. IMHO we need to reuse the algorithm that determines the number of iterations (addLoopBoundsToHeaderDomain) to determine which exit condition applies first. It has to happen in buildDomainsWithBranchConstraints() because the result will need to propagate to successor BBs. Currently addLoopBoundsToHeaderDomain() just look for union of all backedge conditions (which means leaving not the loop here). The patch in llvm.org/PR35465 changes it to look for exit conditions instead. This is required because there might be other exit conditions that do not alternatively go back to the loop header. Differential Revision: https://reviews.llvm.org/D45649 llvm-svn: 330858	2018-04-25 18:53:33 +00:00
Michael Kruse	5369ea5dd5	Allow arbitrary function calls for debugging purposes. Add the switch -polly-debug-func to define the name of a debug function. This function is ignored for any validity check. Its purpose is to allow to observe a value after transformation by a SCoP, and to follow which statements are executed in which order. For instance, consider the following code: static void dbg_printf(int sum, int i) { fprintf(stderr, "The value of sum is %d, i=%d\n", sum, i); fflush(stderr); } void func(int n) { int sum = 0; for (int i = 0; i < 16; i+=1) { sum += i; dbg_printf(sum, i); } } Executing this after Polly's codegen with -polly-debug-func=dbg_printf reveals the new execution order and the assumed values at that point of execution. Differential Revision: https://reviews.llvm.org/D45728 llvm-svn: 330466	2018-04-20 18:55:44 +00:00
Tobias Grosser	fcc3ad5d3c	[ScopDetect / ScopInfo] Get statistics for scops without any loop correctly Make sure we also counts scops not containing any loops. llvm-svn: 330285	2018-04-18 20:03:36 +00:00
Philip Pfaffe	8da7d1d7ee	[NewPM] Update pass registration for the LLVM plugin interface Summary: As of rL329273, LLVM has a mechanism to load new-pm plugins in opt. Use this API in Polly. Reviewers: grosser, Meinersbur, bollu Reviewed By: grosser, Meinersbur Subscribers: lksbhm, bollu, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D45484 llvm-svn: 330181	2018-04-17 07:59:46 +00:00
Michael Kruse	4485ae0890	[CodeGen] Allow undefined loads in statement instances outside context. A check in assert-builds was meant to verify that a load provides a value in all statement instances (i.e. its domain). The domain is commonly gist'ed within the parameter context to contain fewer constraints. However, statement instances outside the context are no valid executions, hence the value provided can be undefined. Refine the check for valid loads to only needed to be defined within the SCoP context. In addition, the JSONImporter had to be changed to allow importing access relations that are broader than the current access relation, but still defined over all statement instances. This should fix the compiler crash in test-suite's oggenc of the -polly-process-unprofitable buildbot. llvm-svn: 329655	2018-04-10 01:20:51 +00:00
Michael Kruse	db6f71e48d	[ScopInfo] Avoid iterator invalidation. Commit r329640 introduced the removal of all MemoryAccesses of a Scop. It accidentally continued iterating over a vector whose iterators have been invalidated by a MemoryAccess removal. Make a copy of the MemoryAccesses to remove to iterate over while removing them. llvm-svn: 329653	2018-04-10 01:20:41 +00:00
Michael Kruse	192e7f72ca	[ScopInfo] Completely remove MemoryAccesses when their parent statement is removed. Removing a statement left its MemoryAccesses in some lists and maps of the SCoP. Which lists depends on at which phase of the SCoP construction the statement is deleted. Follow-up passes could still see the already deleted MemoryAccesses by iterating through these lists/maps, resulting in an access violation. When removing a ScopStmt, also remove all its MemoryAccesses by using the same mechnism that removes a MemoryAccess. llvm-svn: 329640	2018-04-09 23:13:05 +00:00
Michael Kruse	df8e140349	Remove immediate dominator heuristic for error block detection. This patch removes the heuristic in - Polly :: lib/Support/ScopHelper.cpp The heuristic forces blocks that directly follow a loop header to not to be considered error blocks. It was introduced in r249611 with the following commit message: > This replaces the support for user defined error functions by a > heuristic that tries to determine if a call to a non-pure function > should be considered "an error". If so the block is assumed not to be > executed at runtime. While treating all non-pure function calls as > errors will allow a lot more regions to be analyzed, it will also > cause us to dismiss a lot again due to an infeasible runtime context. > This patch tries to limit that effect. A non-pure function call is > considered an error if it is executed only in conditionally with > regards to a cheap but simple heuristic. In the code below `CCK_Abort2()` would be considered as an error block, but not `CCK_Abort1()` due to this heuristic. ``` for (int i = 0; i < n; i+=1) { if (ErrorCondition1) CCK_Abort1(); // No __attribute__((noreturn)) if (ErrorCondition2) CCK_Abort2(); // No __attribute__((noreturn)) } ``` This does not seem useful. Checking error conditions in the beginning of some work is quite common. It causes a switch default-case to be not considered an error block in SPEC's cactuBSSN. The comment justifying the heuristic mentions a "load", which does not seem to be applicable here. It has been proposed to remove the heuristic. In addition, the patch fixes the following test cases: - Polly :: ScopDetect/mod_ref_read_pointer.ll - Polly :: ScopInfo/max-loop-depth.ll - Polly :: ScopInfo/mod_ref_access_pointee_arguments.ll - Polly :: ScopInfo/mod_ref_read_pointee_arguments.ll - Polly :: ScopInfo/mod_ref_read_pointer.ll - Polly :: ScopInfo/mod_ref_read_pointers.ll The test cases failed after removing the heuristic. Differential Revision: https://reviews.llvm.org/D45274 Contributed-by: Lorenzo Chelini <l.chelini@icloud.com> llvm-svn: 329548	2018-04-09 06:07:44 +00:00
Huihui Zhang	71e54ccd06	[Polly][IslAst] Fix minimal dependence distance. Summary: When checking the parallelism of a scheduling dimension, we first check if excluding reduction dependences the loop is parallel or not. If the loop is not parallel, then we need to return the minimal dependence distance of all data dependences, including the previously subtracted reduction dependences. Reviewers: grosser, Meinersbur, efriedma, eli.friedman, jdoerfert, bollu Reviewed By: Meinersbur Subscribers: llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D45236 llvm-svn: 329214	2018-04-04 18:08:13 +00:00
Tobias Grosser	e5340a8ce9	Move code generation test case to test/CodeGen/ llvm-svn: 327857	2018-03-19 15:05:30 +00:00
Philip Pfaffe	15186d4938	[Polly][CMake] Fix lit setup for building the in the mono repo Summary: When building polly as part of the monorepo (actually, as part of any setup using LLVM_ENABLE_PROJECTS), the LLVMPolly library used in the lit tests ends up in a different directory in the build tree than in an in-tree build Reviewers: Meinersbur, grosser, bollu Reviewed By: Meinersbur Subscribers: mgorny, bollu, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D44078 llvm-svn: 326702	2018-03-05 14:43:04 +00:00
Tobias Grosser	b94863001a	[ScopInfo] Do not use the set dimension ids to carry loop information isl does not guarantee that set dimension ids will be preserved, so using them to carry information is not a good idea. Furthermore, the loop information can be derived without problem from the statement itself. As this even requires less code than propagating loop information on set dimension ids, starting from this commit we just derive the loop information in collectSurroundingLoops directly from the IR. Interestingly this also results in a couple of isl sets to take a simpler representation. llvm-svn: 326664	2018-03-03 19:27:54 +00:00
Tobias Grosser	fa8079d0dc	Update isl to isl-0.18-1047-g4a20ef8 This update: - Removes several deprecated functions (e.g., isl_band). - Improves the pretty-printing of sets by detecting modulos and "false" equalities. - Minor improvements to coalescing and increased robustness of the isl scheduler. This update does not yet include isl commit isl-0.18-90-gd00cb45 (isl_pw_*_alloc: add missing check for compatible spaces, Wed Sep 6 12:18:04 2017 +0200), as this additional check is too tight and unfortunately causes two test case failures in Polly. A patch has been submitted to isl and will be included in the next isl update for Polly. llvm-svn: 325557	2018-02-20 07:26:42 +00:00
Michael Kruse	a6716d9d81	[ScopBuilder] scalar-indep: Fix mutually referencing PHIs. Two or more PHIs mutually using each other directly or indirectly as incoming value could cause that a PHI WRITE be added before the PHI READ (i.e. it overwrites the current incoming value with the next incoming value before it being read). Fix by ensuring that the PHI WRITE and PHI READ are in the same statement. This should fix the miscompile of SingleSource/Benchmark/Misc/whetstone from the test-suite. llvm-svn: 324934	2018-02-12 21:09:40 +00:00
Michael Kruse	a43ba2d84f	[ScopBuilder] Make -polly-stmt-granularity=scalar-indep the default. Splitting basic blocks into multiple statements if there are now additional scalar dependencies gives more freedom to the scheduler, but more statements also means higher compile-time complexity. Switch to finer statement granularity, the additional compile time should be limited by the number of operations quota. The regression tests are written for the -polly-stmt-granularity=bb setting, therefore we add that flag to those tests that break with the new default. Some of the tests only fail because the statements are named differently due to a basic block resulting in multiple statements, but which are removed during simplification of statements without side-effects. Previous commits tried to reduce this effect, but it is not completely avoidable. Differential Revision: https://reviews.llvm.org/D42151 llvm-svn: 324169	2018-02-03 06:59:47 +00:00
Michael Kruse	a230f22f4b	[ScopBuilder] Prefer PHI Write accesses in the statement the incoming value is defined. Theoretically, a PHI write can be added to any statement that represents the incoming basic block. We previously always chose the last because the incoming value's definition is guaranteed to be defined. With this patch the PHI write is added to the statement that defines the incoming value. It avoids the requirement for a scalar dependency between the defining statement and the statement containing the write. As such the logic for -polly-stmt-granularity=scalar-indep that ensures that there is such scalar dependencies can be removed. Differential Revision: https://reviews.llvm.org/D42147 llvm-svn: 323284	2018-01-23 23:56:36 +00:00
Dimitry Andric	e6de5a100d	Assume the shared library path variable is LD_LIBRARY_PATH on systems except Darwin and Windows. This prevents inserting an environment variable with an empty name (which is illegal and leads to a Python exception) on any of the BSDs. llvm-svn: 323041	2018-01-20 14:35:05 +00:00
Daniel Neilson	751a2cebc5	Change memcpy/memove/memset to have dest and source alignment attributes (Step 1). Summary: Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset intrinsics. This change updates the polly tests for this change. The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change removes the alignment argument in favour of placing the alignment attribute on the source and destination pointers of the memory intrinsic call. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) At this time the source and destination alignments must be the same (Step 1). Step 2 of the change, to be landed shortly, will relax that contraint and allow the source and destination to have different alignments. llvm-svn: 322963	2018-01-19 17:12:48 +00:00
Michael Kruse	9cfb0ac223	[ScopBuilder] Revise statement naming when there are multiple statements per BB. The goal is to have -polly-stmt-granularity=bb and -polly-stmt-granularity=scalar-indep to have the same names if there is just one statement per basic block. This fixes a fluke when Polybench's jacobi-2d is optimized differently depending on the -polly-stmt-granularity option, although both options create the same SCoP, just with different statement names. The new naming scheme is: With -polly-use-llvm-names=0: Stmt<BBIdx as decimal><Idx within BB as letter> With -polly-use-llvm-names=1: Stmt_BBName_<Idx within BB as letter> The <Idx within BB> suffix is omitted for the main statement of a BB. The main statement is either the one containing the first store or call (those cannot be removed by the simplifyer), or if there is no such instruction, the first. If after simplification there is just a single statement left, it should be the main statement and have the same names as with -polly-stmt-granularity=bb. Differential Revision: https://reviews.llvm.org/D42136 llvm-svn: 322852	2018-01-18 15:15:50 +00:00
Eli Friedman	a75d53c83f	[polly] [ScopInfo] Don't use isl_val_get_num_si. isl_val_get_num_si crashes on overflow, so don't use it on arbitrary integers. Testcase only crashes on platforms where long is 32 bits because of the signature of isl_val_get_num_si; not sure if it's possible to write a testcase which crashes if long is 64 bits. There are a few other places in polly which use isl_val_get_num_si; they probably need to be fixed as well. I don't think polly uses any of the other "long" isl APIs in an unsafe manner. Differential Revision: https://reviews.llvm.org/D42129 llvm-svn: 322766	2018-01-17 21:59:02 +00:00
Michael Kruse	271deb17b0	[CodeGen] Fix noalias annotations for memcpy/memmove. Memory transfer instructions take two pointers. It is not defined to which of those a noalias annotation applies. To ensure correctness, do not add noalias annotations to memcpy/memmove instructions anymore. The caused a miscompile with test-suite's MultiSource/Applications/obsequi. Since r321138, the MemCpyOpt pass would remove memcpy/memmove calls if known to copy uninitialized memory. In that case, it was initialized by another memcpy, but the annotation for the target pointer said it would not alias. The annotation was actually meant for the source pointer, which was was an alloca and could not alias with the target pointer. llvm-svn: 321371	2017-12-22 17:44:53 +00:00
Michael Kruse	5c2441901f	Fix isl out-of-quota errors affecting later quota guards. If an out-of-quota error occurred, the last error would be isl_error_quota unless a different error occured. We typically check whether the max-operations occured by comparing to that error value after leaving the quota guard. This would check whether there ever was a quota-error, not just in the last quota guards. The observable bug occurred if the max-operations limit was reached in DeLICM, and if -polly-dependences-computout=0, DependenceInfo would think that the quota for computing dependencies was the reason, i.e., fail the operation even if the calculation itself was successful. Fix by reseting the last error to isl_error_none when entering a quota guard, signaling that no quota error occured unless in the guard's scope. llvm-svn: 321329	2017-12-22 01:10:31 +00:00
Michael Kruse	5f0e8a46cf	[ScopBuilder] Split statements on encountering store instructions. Introduce -polly-stmt-granularity=store option. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D37337 llvm-svn: 320360	2017-12-11 12:51:24 +00:00
Philip Pfaffe	f6f8b25e58	[NFC] In GPGPU testcases, replace numeric registers in CHECK directives. Using numeric registers is flaky, since as soon as one additional instruction is generated by us, all the tests need to be adapted. llvm-svn: 319544	2017-12-01 14:16:39 +00:00
Philip Pfaffe	4fe21814d1	Handle Top-Level-Regions in polly::isHoistableLoad Summary: This can be seen as a follow-up on my previous differential [D33411](https://reviews.llvm.org/D33411). We received a bug report where this error was triggered. I have tried my best to recreate the issue in a minimal lit testcase which is also part of this differential. I only handle return instructions as predecessors to a virtual TLR-exit right now. From inspecting the codebase, it seems `unreachable` instructions may also be of interest here. If requested, I can extend my patches to consider them as well. I would also apply this on `ScopHelper.cpp::isErrorBlock` (see D33411), of course. Reviewers: philip.pfaffe, bollu Reviewed By: bollu Subscribers: Meinersbur, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D40492 llvm-svn: 319431	2017-11-30 13:06:10 +00:00
Michael Kruse	163cacb469	[CodeGen] Detect empty domain because of parameters context. Isl does not allow generating isl_ast_expr from an isl_pw_aff that has an empty domain (i.e. has no pieces). We already detected the case if the isl_pw_aff comes with an empty domain. isl_ast_build also considers the domain empty if it is disjoint with the parameter context (e.g. parameters values that we exclude by runtime versioning). Intersect the access relation domain with the parameter context to also detect such practically empty access domains. The effective pointer used in the generated code is unimportand because it will never be executed. This fixes llvm.org/PR35362 llvm-svn: 318806	2017-11-21 22:11:10 +00:00
Philip Pfaffe	00fd43b327	Port ScopInfo to the isl cpp bindings Summary: Most changes are mechanical, but in one place I changed the program semantics by fixing a likely bug: In `Scop::hasFeasibleRuntimeContext()`, I'm now explicitely handling the error-case. Before, when the call to `addNonEmptyDomainConstraints()` returned a null set, this (probably) accidentally worked because isl_bool_error converts to true. I'm checking for nullptr now. Reviewers: grosser, Meinersbur, bollu Reviewed By: Meinersbur Subscribers: nemanjai, kbarton, pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D39971 llvm-svn: 318632	2017-11-19 22:13:34 +00:00
Michael Kruse	68821a8b91	[ZoneAlgo/ForwardOpTree] Normalize PHIs to their known incoming values. Represent PHIs by their incoming values instead of an opaque value of themselves. This allows ForwardOpTree to "look through" the PHIs and forward the incoming values since forwardings PHIs is currently not supported. This is particularly useful to cope with PHIs inserted by GVN LoadPRE. The incoming values all resolve to a load from a single array element which then can be forwarded. It should in theory also reduce spurious conflicts in value mapping (DeLICM), but I have not yet found a profitable case yet, so it is not included here. To avoid transitive closure and potentially necessary overapproximations of those, PHIs that may reference themselves are excluded from normalization and keep their opaque self-representation. Differential Revision: https://reviews.llvm.org/D39333 llvm-svn: 317008	2017-10-31 16:11:46 +00:00
Michael Kruse	ff426d974d	[DeLICM] Fix wrong assumed access execution order. ForwardOpTree may already transform a scalar access to an array accesses. The access remains implicit (isOriginalScalarKind(), meaning that the access is always executed at the begin/end of a statement), but targets an array (isLatestArrayKind(), which is unrelated to whether the execution is implicit/explicit). Fix by properly using isOriginalXXX() to determine execution order. This fixes the buildbots on MultiSource/Benchmarks/DOE-ProxyApps-C/miniGMG. llvm-svn: 316995	2017-10-31 12:50:25 +00:00
Michael Kruse	06618bf71a	[OpenMP] Fix reference collection of latest base ptrs. When collecting base pointers that need to be made available in parallel subfunctions, use the base pointer associated with the latest ScopArrayInfo, instead of the original one. llvm-svn: 316983	2017-10-31 10:28:22 +00:00
Philip Pfaffe	9b1d1e6ae7	Fix two testcases. NFC intended. Add missing %loadPolly directive to support out of tree builds. One of the changes is somewhat bigger, because the directive turns on LLVM names, and the testcase deosn't use those. llvm-svn: 316870	2017-10-29 21:00:48 +00:00
Michael Kruse	822dfe271b	[ForwardOpTree] Reload know values. For scalar accesses, change the access target to an array element that is known to contain the same value. This may become an alternative to forwardKnownLoad which creates new loads (and therefore closer to forwarding speculatives). Reloading does not require the known value originating from a load, but can be a store as well. Differential Revision: https://reviews.llvm.org/D39325 llvm-svn: 316766	2017-10-27 14:26:14 +00:00
Michael Kruse	b6b65834a1	[Simplify] Mark (and sweep) based on latest access relation. Previously we marked scalars based on the original access function. However, when a scalar read access is redirected, the original definition (or incoming values of a PHI) is not used anymore, and can be deleted (unless referenced by use that has not been redirected). llvm-svn: 316660	2017-10-26 12:34:36 +00:00
Michael Kruse	37d57dac63	[DeLICM] Add more tests for loop layouts. NFC. llvm-svn: 316642	2017-10-26 08:03:28 +00:00
Michael Kruse	19cd61dc11	[DeLICM] Do not try to map to multiple array elements. Add check and skip when the store used to determine the target accesses multiple array elements. Only a single array location should for mapping the scalar. Having multiple creates problems when deciding which element to load from. While MemoryAccess::getAddressFunction() should select just one of them, other problems arise in code that assumes that there is just one target element per statement instance. This fixes llvm.org/PR34989 This also reverts r313902 which fixed llvm.org/PR34485 also caused by a non-functional target array element. This patch avoids the situation to occur in the first place. llvm-svn: 316432	2017-10-24 13:05:24 +00:00
Anna Thomas	0026d91437	[Polly] Add XFAIL to large-numbers-in-boundary-context.ll After rL315683 (improve SCEV to calculate max BETakenCount when end bound of loop is variant and loop is of form {Start,+1, Stride} LT End) this test in polly started failing. However, as discussed in https://reviews.llvm.org/rL315683, this polly test is not a loops bound test and the MaxBECount calculated by SCEV looks correct. The max BECount is the value calculated even when the end bound of loop is invariant. As discussed with Tobias offline, I'm marking this as an XFAIL, until he gets a chance to update the testcase, so the build bot goes to green. llvm-svn: 315912	2017-10-16 15:12:39 +00:00
Michael Kruse	cc345e6e94	[ScopBuilder] Introduce -polly-stmt-granularity=scalar-indep option. The option splits BasicBlocks into minimal statements such that no additional scalar dependencies are introduced. The algorithm is based on a union-find structure, and unites sets if putting them into separate statements would introduce a scalar dependencies. As a consequence, instructions may be split into separate statements such their relative order is different than the statements they are in. This is accounted for instructions whose relative order matters (e.g. memory accesses). The algorithm is generic in that heuristic changes can be made relatively easily. We might relax the order requirement for read-reads or accesses to different base pointers. Forwardable instructions can be made to not cause a join. This implementation gives us a speed-up of 82% in SPEC 2006 456.hmmer benchmark by allowing loop-distribution in a hot loop such that one of the loops can be vectorized. Differential Revision: https://reviews.llvm.org/D38403 llvm-svn: 314983	2017-10-05 13:43:00 +00:00
Tobias Grosser	c52b71db15	[GPGPU] Make sure escaping invariant load hoisted scalars are preserved We make sure that the final reload of an invariant scalar memory access uses the same stack slot into which the invariant memory access was stored originally. Earlier, this was broken as we introduce a new stack slot aside of the preload stack slot, which remained uninitialized and caused our escaping loads to contain garbage. This happened due to us clearing the pre-populated values in EscapeMap after kernel code generation. We address this issue by preserving the original host values and restoring them after kernel code generation. EscapeMap is not expected to be used during kernel code generation, hence we clear it during kernel generation to make sure that any unintended uses are noticed. llvm-svn: 314894	2017-10-04 10:24:23 +00:00
Jakub Kuderski	119753ad14	UnXFAIL tests that previously failed VerifyDFSNumbers They started passing again by the DT::eraseNode fix in r314847. llvm-svn: 314850	2017-10-03 21:23:56 +00:00
Jakub Kuderski	3c3bf74022	XFAIL two test that fail VerifyDFSNumbers DominatorTree check This test XFAILs two test that start to fail when verifying DT's DFS numbers, as per Tobias' suggestion. Related VerifyDFSNumbers patch: D38331. llvm-svn: 314800	2017-10-03 14:31:53 +00:00
Michael Kruse	f5745b4e7d	[ScopBuilder] Build invariant loads separately. Create the MemoryAccesses of invariant loads separately and before all other MemoryAccesses. Invariant loads are classified as synthesizable and therefore are not contained in any statement. When iterating over all instructions of all statements, the invariant loads are consequently not processed and iterating over them separately becomes necessary. This patch can change the order in which MemoryAccesses are created, but otherwise has no functional change. Some temporary code is introduced to ensure correctness, but will be removed in the next commit. llvm-svn: 314664	2017-10-02 11:41:27 +00:00
Michael Kruse	89a6f3db02	[ScopBuilder] Build escaping dependencies separately. Instructions that compute escaping values might be synthesizable and therefore not contained in any ScopStmt. When buildAccessFunctions is changed to only iterate over the instruction list of statement, "free" instructions still need to be written. We do this after the main MemoryAccesses have been created. This can change the order in which MemoryAccesses are created, but has otherwise no functional change. llvm-svn: 314663	2017-10-02 11:41:19 +00:00
Michael Kruse	c013399197	[ScopDetect] Do not add loads out of the SCoP to required invariant loads. Loads before the SCoP are always invariant within the SCoP and therefore are no "required invariant loads". An assertion failes in ScopBuilder when it finds such an invariant load. Fix by not adding such loads to the required invariant load list. This likely will cause the region to be not considered a valid SCoP. We may want to unconditionally accept instructions defined before the region as valid invariant conditions instead of rejecting them. This fixes a compilation crash of SPEC CPU2006 453.povray's render.cpp. llvm-svn: 314636	2017-10-01 22:19:28 +00:00
Tobias Grosser	d215e684b3	Add missing REQUIRES line llvm-svn: 314625	2017-10-01 13:14:40 +00:00
Tobias Grosser	2fb847fbf6	[GPGPU] Set Polly's RTC to false in case invariant load hoisting fails This matches the behavior we already have in lib/Codegen/CodeGeneration.cpp and makes sure that we fall back to the original code. It seems when invariant load hoisting was introduced to the GPGPU backend we missed to reset the RTC flag, such that kernels where invariant load hoisting failed executed the 'optimized' SCoP, which however is set to a simple 'unreachable'. Unsurprisingly, this results in hard to debug issues that are a lot of fun to debug. llvm-svn: 314624	2017-10-01 12:39:14 +00:00
Tobias Grosser	1f93d0f1f9	[ScopInfo] Allow PHI nodes that reference an error block As long as these PHI nodes are only referenced by terminator instructions. llvm-svn: 314212	2017-09-26 15:00:10 +00:00
Tobias Grosser	5e531dfef4	[ScopInfo] Allow invariant loads in branch conditions In case the value used in a branch condition is a load instruction, assume this load to be invariant. llvm-svn: 314146	2017-09-25 20:27:15 +00:00
Tobias Grosser	0a62b2d887	[ScopInfo] Allow uniform branch conditions If all but one branch come from an error condition and the incoming value from this branch is a constant, we can model this branch. llvm-svn: 314116	2017-09-25 16:37:15 +00:00
Tobias Grosser	ee457594c2	[ScopDetect/Info] Look through PHIs that follow an error block In case a PHI node follows an error block we can assume that the incoming value can only come from the node that is not an error block. As a result, conditions that seemed non-affine before are now in fact affine. This is a recommit of r312663 after fixing test/Isl/CodeGen/phi_after_error_block_outside_of_scop.ll llvm-svn: 314075	2017-09-24 09:25:30 +00:00
Tobias Grosser	75d133f0ac	[IslExprBuilder] Do not generate RTC with more than 64 bit Such RTCs may introduce integer wrapping intrinsics with more than 64 bit, which are translated to library calls on AOSP that are not part of the runtime and will consequently cause linker errors. Thanks to Eli Friedman for reporting this issue and reducing the test case. llvm-svn: 314065	2017-09-23 15:32:07 +00:00
Michael Kruse	bfca5f4334	[DeLICM] Allow non-injective PHIRead->PHIWrite mapping. Remove an assertion that tests the injectivity of the PHIRead -> PHIWrite relation. That is, allow a single PHI write to be used by multiple PHI reads. This may happen due to some statements containing the PHI write not having the statement instances that would overwrite the previous incoming value due to (assumed/invalid) contexts. This result in that PHI write is mapped to multiple targets which is not supported. Codegen will select one one of the targets using getAddressFunction(). However, the runtime check should protect us from this case ever being executed. We therefore allow injective PHI relations. Additional calculations to detect/santitize this case would probably not be worth the compuational effort. This fixes llvm.org/PR34485 llvm-svn: 313902	2017-09-21 19:08:23 +00:00
Michael Kruse	6d7a7896ce	[ScopInfo] Use map for value def/PHI read accesses. Before this patch, ScopInfo::getValueDef(SAI) used getStmtFor(Instruction*) to find the MemoryAccess that writes a MemoryKind::Value. In cases where the value is synthesizable within the statement that defines, the instruction is not added to the statement's instruction list, which means getStmtFor() won't return anything. If the synthesiable instruction is not synthesiable in a different statement (due to being defined in a loop that and ScalarEvolution cannot derive its escape value), we still need a MemoryKind::Value and a write to it that makes it available in the other statements. Introduce a separate map for this purpose. This fixes MultiSource/Benchmarks/MallocBench/cfrac where -polly-simplify could not find the writing MemoryAccess for a use. The write was not marked as required and consequently was removed. Because this could in principle happen as well for PHI scalars, add such a map for PHI reads as well. llvm-svn: 313881	2017-09-21 14:23:11 +00:00
Michael Kruse	0e370cf1a7	Check whether IslAstInfo and DependenceInfo were computed for the same Scop. Since -polly-codegen reports itself to preserve DependenceInfo and IslAstInfo, we might get those analysis that were computed by a different ScopInfo for a different Scop structure. This would be unfortunate because DependenceInfo and IslAstInfo hold references to resources allocated by ScopInfo/ScopBuilder/Scop (e.g. isl_id). If -polly-codegen and DependenceInfo/IslAstInfo do not agree on which Scop to use, unpredictable things can happen. When the ScopInfo/Scop object is freed, there is a high probability that the new ScopInfo/Scop object will be created at the same heap position with the same address. Comparing whether the Scop or ScopInfo address is the expected therefore is unreliable. Instead, we compare the address of the isl_ctx object. Both, DependenceInfo and IslAstInfo must hold a reference to the isl_ctx object to ensure it is not freed before the destruction of those analyses which might happen after the destruction of the Scop/ScopInfo they refer to. Hence, the isl_ctx will not be freed and its address not reused as long there is a DependenceInfo or IslAstInfo around. This fixes llvm.org/PR34441 llvm-svn: 313842	2017-09-21 00:01:13 +00:00
Michael Kruse	8dceb76066	[ScheduleOptimizer] Fix and test schedule tree statistics. Fix walking over the schedule tree to collect its properties (Number of permutable bands etc.). Also add regression tests for these statistics. llvm-svn: 313750	2017-09-20 11:53:05 +00:00
Michael Kruse	ef8325ba50	[ForwardOpTree] Test the max operations quota. cl::opt<unsigned long> is not specialized and hence the option -polly-optree-max-ops impossible to use. Replace by supported option cl::opt<unsigned>. Also check for an error state when computing the written value, which happens when the quota runs out. llvm-svn: 313546	2017-09-18 17:43:50 +00:00
Michael Kruse	eac3eebfea	[test] Enable -polly-codegen-verify for regression tests. In r301670 IR verification was disabled. Since then, CodeGen writing malformed IR would only be noticed by unpredictable behavior in follow-up passes (e.g. segfaults, infinite loops) or IR verification in the backend assert builds. Re-enable -polly-codegen-verify at for the regression tests to ensure that malformed IR is detected where Polly generated malformed IR in the past and changes in CodeGen are at least partially covered by check-polly (otherwise malformed IR may only get noticed when the buildbots run the test-suite). Differential Revision: https://reviews.llvm.org/D37969 llvm-svn: 313527	2017-09-18 12:34:11 +00:00
Zachary Turner	ce92db13ea	Resubmit "[lit] Force site configs to run before source-tree configs" This is a resubmission of r313270. It broke standalone builds of compiler-rt because we were not correctly generating the llvm-lit script in the standalone build directory. The fixes incorporated here attempt to find llvm/utils/llvm-lit from the source tree returned by llvm-config. If present, it will generate llvm-lit into the output directory. Regardless, the user can specify -DLLVM_EXTERNAL_LIT to point to a specific lit.py on their file system. This supports the use case of someone installing lit via a package manager. If it cannot find a source tree, and -DLLVM_EXTERNAL_LIT is either unspecified or invalid, then we print a warning that tests will not be able to run. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313407	2017-09-15 22:10:46 +00:00
Zachary Turner	83dcb68468	Revert "[lit] Force site configs to run before source-tree configs" This patch is still breaking several multi-stage compiler-rt bots. I already know what the fix is, but I want to get the bots green for now and then try re-applying in the morning. llvm-svn: 313335	2017-09-15 02:56:40 +00:00
Zachary Turner	a0e55b6403	[lit] Force site configs to be run before source-tree configs This patch simplifies LLVM's lit infrastructure by enforcing an ordering that a site config is always run before a source-tree config. A significant amount of the complexity from lit config files arises from the fact that inside of a source-tree config file, we don't yet know if the site config has been run. However it is always required to run a site config first, because it passes various variables down through CMake that the main config depends on. As a result, every config file has to do a bunch of magic to try to reverse-engineer the location of the site config file if they detect (heuristically) that the site config file has not yet been run. This patch solves the problem by emitting a mapping from source tree config file to binary tree site config file in llvm-lit.py. Then, during discovery when we find a config file, we check to see if we have a target mapping for it, and if so we use that instead. This mechanism is generic enough that it does not affect external users of lit. They will just not have a config mapping defined, and everything will work as normal. On the other hand, for us it allows us to make many simplifications: * We are guaranteed that a site config will be executed first * Inside of a main config, we no longer have to assume that attributes might not be present and use getattr everywhere. * We no longer have to pass parameters such as --param llvm_site_config=<path> on the command line. * It is future-proof, meaning you don't have to edit llvm-lit.in to add support for new projects. * All of the duplicated logic of trying various fallback mechanisms of finding a site config from the main config are now gone. One potentially noteworthy thing that was required to implement this change is that whereas the ninja check targets previously used the first method to spawn lit, they now use the second. In particular, you can no longer run lit.py against the source tree while specifying the various `foo_site_config=<path>` parameters. Instead, you need to run llvm-lit.py. Differential Revision: https://reviews.llvm.org/D37756 llvm-svn: 313270	2017-09-14 16:47:58 +00:00
Roman Gareev	925ce50f1b	Unroll and separate the remaining parts of isolation The remaining parts produced by the full partial tile isolation can contain hot spots that are worth to be optimized. Currently, we rely on the simple loop unrolling pass, LiCM and the SLP vectorizer to optimize such parts. However, the approach can suffer from the lack of the information about aliasing that Polly provides using additional alias metadata or/and the lack of the information required by simple loop unrolling pass. This patch is the first step to optimize the remaining parts. To do it, we unroll and separate them. In case of, for instance, Intel Kaby Lake, it helps to increase the performance of the generated code from 39.87 GFlop/s to 49.23 GFlop/s. The next possible step is to avoid unrolling performed by Polly in case of isolated and remaining parts and rely only on simple loop unrolling pass and the Loop vectorizer. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D37692 llvm-svn: 312929	2017-09-11 17:46:47 +00:00
Michael Kruse	2f5cbc449a	[CodeGen] Bitcast scalar writes to actual value. The type of NewValue might change due to ScalarEvolution looking though bitcasts. The synthesized NewValue therefore becomes the type before the bitcast. llvm-svn: 312718	2017-09-07 12:15:01 +00:00
Michael Kruse	8ee179d3b4	Revert "[ScopDetect/Info] Look through PHIs that follow an error block" This reverts commit r312410 - [ScopDetect/Info] Look through PHIs that follow an error block The commit caused generation of invalid IR due to accessing a parameter that does not dominate the SCoP. llvm-svn: 312663	2017-09-06 19:05:40 +00:00
Michael Kruse	48c726f925	[test] Add forgotten REQUIRES: line. llvm-svn: 312632	2017-09-06 13:11:24 +00:00
Michael Kruse	bd84ce8931	[ZoneAlgo] Handle non-StoreInst/LoadInst MemoryAccesses including memset. Up to now ZoneAlgo considered array elements access by something else than a LoadInst or StoreInst as not analyzable. This patch removes that restriction by using the unknown ValInst to describe the written content, repectively the element type's null value in case of memset. Differential Revision: https://reviews.llvm.org/D37362 llvm-svn: 312630	2017-09-06 12:40:55 +00:00
Michael Kruse	420c4863a9	[Simplify] Actually remove unsed instruction from region header. Since r312249 instructions of a entry block of region statements are not marked as root anymore and hence can theoretically be removed if unused. Theoretically, because the instruction list was not changed. Still, MemoryAccesses for unused instructions were removed. This lead to a failed assertion in the code generator when the MemoryAccess for the still listed instruction was not found. This hould fix the Assertion failed: ArrayAccess && "No array access found for instruction!", file ScopInfo.h, line 1494 compiler crashes. llvm-svn: 312566	2017-09-05 19:44:39 +00:00
Tobias Grosser	d6e0679c4e	[ForwardOp] Remove read accesses for all instructions that have been moved Before this patch, OpTree did not consider forwarding an operand tree consisting of only single LoadInst as useful. The motivation was that, like an access to a read-only variable, it would just replace one MemoryAccess by another. However, in contrast to read-only accesses, this would replace a scalar access by an array access, which is something worth doing. In addition, leaving scalar MemoryAccess is problematic in that VirtualUse prioritizes inter-Stmt use over intra-Stmt. It was possible that the same LLVM value has a MemoryAccess for accessing the remote Stmt's LoadInst as well as having the same LoadInst in its own instruction list (due to being forwarded from another operand tree). With this patch we ensure that if a LoadInst is forwarded is any operand tree, also the operand tree containing just the LoadInst is forwarded as well, which effectively removes the scalar MemoryAccess such that only the array access remains, not both. Thanks Michael for the detailed explanation. Reviewers: Meinersbur, bellu, singam-sanjay, gareevroman Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37424 llvm-svn: 312456	2017-09-03 19:52:15 +00:00
Tobias Grosser	701d943d12	[IslAst] Do not assert in case of empty min/max alias locations In certain situations, the context in the isl_ast_build could result for the min/max locations of our alias sets to become empty, which would cause an internal error in isl, which is then unable to derive a value for these expressions. Check these conditions before code generating expressions and instead assume that alias check succeeded. This is valid, as the corresponding memory accesses will not be executed under any valid context. This fixed llvm.org/PR34432. Thanks to Qirun Zhang for reporting. llvm-svn: 312455	2017-09-03 19:47:19 +00:00
Tobias Grosser	99ccf05694	[ScopHelper] Do not crash on unreachable blocks This resolves llvm.org/PR34433. Thanks to Zhendong Su for reporting. llvm-svn: 312451	2017-09-03 18:01:22 +00:00
Tobias Grosser	4baedc70d1	[ScopDetect/Info] Look through PHIs that follow an error block In case a PHI node follows an error block we can assume that the incoming value can only come from the node that is not an error block. As a result, conditions that seemed non-affine before are now in fact affine. llvm-svn: 312410	2017-09-02 08:25:55 +00:00
Siddharth Bhat	3928e3f50a	[ISLNodeBuilder] Materialize Fortran array sizes of arrays without memory accesses. In Polly, we specifically add a paramter to represent the outermost dimension size of fortran arrays. We do this because this information is statically available from the fortran metadata generated by dragonegg. However, we were only materializing these parameters (meaning, creating an llvm::Value to back the isl_id) from memory accesses. This is wrong, we should materialize parameters from scop array info. It is wrong because if there is a case where we detect 2 fortran arrays, but only one of them is accessed, we may not materialize the other array's dimensions at all. This is incorrect. We fix this by looping over all `polly::ScopArrayInfo` in a scop, rather that just all `polly::MemoryAccess`. Differential Revision: https://reviews.llvm.org/D37379 llvm-svn: 312350	2017-09-01 18:55:43 +00:00
Michael Kruse	0c6c555beb	Fix Memory Access of failing tests. Mark scalar dependences for different statements belonging to same BB as 'Inter'. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D37147 llvm-svn: 312324	2017-09-01 11:36:52 +00:00
Tobias Grosser	2307f86c47	[ForwardOpTree] Allow forwarding in the presence of region statements Summary: After region statements now also have instruction lists, this is a straightforward extension. Reviewers: Meinersbur, bollu, singam-sanjay, gareevroman Reviewed By: Meinersbur Subscribers: hfinkel, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37298 llvm-svn: 312249	2017-08-31 16:04:49 +00:00
Siddharth Bhat	56572c6a5e	[PPCGCodeGen] Convert intrinsics to libdevice functions whenever possible. This is useful when we face certain intrinsics such as `llvm.exp.*` which cannot be lowered by the NVPTX backend while other intrinsics can. So, we would need to keep blacklists of intrinsics that cannot be handled by the NVPTX backend. It is much simpler to try and promote all intrinsics to libdevice versions. This patch makes function/intrinsic very uniform, and will always try to use a libdevice version if it exists. Differential Revision: https://reviews.llvm.org/D37056 llvm-svn: 312239	2017-08-31 13:03:37 +00:00
Tobias Grosser	c43d0360cc	[BlockGenerator] Generate entry block of regions from instruction lists The adds code generation support for the previous commit. This patch has been re-applied, after the memory issue in the previous patch has been fixed. llvm-svn: 312211	2017-08-31 03:17:35 +00:00
Tobias Grosser	bd15d13d4e	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. This change set is reapplied, after a memory corruption issue had been fixed. llvm-svn: 312210	2017-08-31 03:15:56 +00:00
Tobias Grosser	d3edc16416	Revert "[ScopInfo] Use statement lists for entry blocks of region statements" This reverts commit r312128. It aused some memory issues. llvm-svn: 312209	2017-08-31 02:43:49 +00:00
Tobias Grosser	6f1f5cbb5b	Revert "[BlockGenerator] Generate entry block of regions from instruction lists" This reverts commit r312129. It caused some memory issues. llvm-svn: 312208	2017-08-31 02:43:27 +00:00
Adrian Prantl	6120801066	Adapt testcase to LLVM change in DIGlobalVariableExpression. llvm-svn: 312147	2017-08-30 18:12:35 +00:00
Tobias Grosser	1e34508bcc	[BlockGenerator] Generate entry block of regions from instruction lists The adds code generation support for the previous commit. llvm-svn: 312129	2017-08-30 15:08:30 +00:00
Tobias Grosser	6fbe4c8501	[ScopInfo] Use statement lists for entry blocks of region statements By using statement lists in the entry blocks of region statements, instruction level analyses also work on region statements. We currently only model the entry block of a region statements, as this is sufficient for most transformations the known-passes currently execute. Modeling instructions in the presence of control flow (e.g. infinite loops) is left out to not increase code complexity too much. It can be added when good use cases are found. llvm-svn: 312128	2017-08-30 15:08:21 +00:00
Michael Kruse	591255183b	[ScopBuilder] Introduce metadata for splitting scop statement. This patch allows annotating of metadata in ir instruction (with "polly_split_after"), which specifies where to split a particular scop statement. Contributed-by: Nandini Singhal <cs15mtech01004@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36402 llvm-svn: 312107	2017-08-30 10:11:06 +00:00
Michael Kruse	4728184342	[ZoneAlgo] More fine-grained bail-out. ZoneAlgo used to bail out for the complete SCoP if it encountered something violating its assumption. This meant the neither OpTree can forward any load nor DeLICM do anything in such cases, even if their transformations are unrelated to the violations. This patch adds a list of compatible elements (currently with the granularity of entire arrays) that can be used for analysis. OpTree and DeLICM can then check whether their transformations only concern compatible elements, and skip non-compatible ones. This will be useful for e.g. Polybench's benchmarks covariance, correlation, bicg, doitgen, durbin, gramschmidt, adi that have assumption violation, but which are not necessarily relevant for all transformations. Differential Revision: https://reviews.llvm.org/D37219 llvm-svn: 311929	2017-08-28 20:39:07 +00:00
Tobias Grosser	ee8ad1c0ff	[IslAst] Do not compare arrays in alias check which are known to be identical This possibly helps to avoid run-time check failures in the COSMO kernels. llvm-svn: 311920	2017-08-28 20:17:02 +00:00
Tobias Grosser	93ab558d2e	[Detect] Consider nested loop profitable if entry block is not in loop In cases where the entry block of a scop was not contained in a loop that was part of the scop region and at the same time there was a loop surrounding the scop, we missed to count the loops in the scop and consequently did not consider the scop profitable. We correct this by only moving to the loop parent, in case the current loop is loop contained in the scop. This increases the number of loops in COSMO which we assume to be profitable from 3974 to 4981. llvm-svn: 311863	2017-08-27 21:39:25 +00:00
Tobias Grosser	6d0970f64e	Revert "[polly] Fix ScopDetectionDiagnostic test failure caused by r310940" This reverts commit 950849ece9bb8fdd2b41e3ec348b9653b4e37df6. This commit broke various buildbots. llvm-svn: 311692	2017-08-24 19:47:15 +00:00
Michael Kruse	b795bfc0d4	[CodeGen] Detect impossible partial write conditions more reliably. Whether a partial write is tautological/unsatisfiable not only depends on the access domain, but also on the domain covered by its node in the AST. In the example below, there are two instances of Stmt_cond_false. It may have a partial write access that is not executed in instance Stmt_cond_false(0). for (int c0 = 0; c0 < tmp5; c0 += 1) { Stmt_for_body344(c0); if (tmp5 >= c0 + 2) Stmt_cond_false(c0); Stmt_cond_end(c0); } if (tmp5 <= 0) { Stmt_for_body344(0); Stmt_cond_false(0); Stmt_cond_end(0); } Isl cannot derive a subscript for an array element that is never accessed. This caused an error in that no subscript expression has been generated in IslNodeBuilder::createNewAccesses, but BlockGenerator expected one to exist because there is an execution of that write, just not in that ast node. Fixed by instead of determining whether the access domain is empty, inspect whether isl generated a constant "false" ast expression in the current ast node. This should fix a compiler crash of the aosp buildbot. llvm-svn: 311663	2017-08-24 14:51:35 +00:00
Andreas Simbuerger	e478e2de83	[Polly][WIP] Scalar fully indexed expansion Summary: This patch comes directly after https://reviews.llvm.org/D34982 which allows fully indexed expansion of MemoryKind::Array. This patch allows expansion for MemoryKind::Value and MemoryKind::PHI. MemoryKind::Value seems to be working with no majors modifications of D34982. A test case has been added. Unfortunatly, no "run time" checks can be done for now because as @Meinersbur explains in a comment on D34982, DependenceInfo need to be cleared and reset to take expansion into account in the remaining part of the Polly pipeline. There is no way to do that in Polly for now. MemoryKind::PHI is not working. Test case is in place, but not working. To expand MemoryKind::Array, we expand first the write and then after the reads. For MemoryKind::PHI, the idea of the current implementation is to exchange the "roles" of the read and write and expand first the read according to its domain and after the writes. But with this strategy, I still encounter the problem of union_map in new access map. For example with the following source code (source code of the test case) : ``` void mse(double A[Ni], double B[Nj]) { int i,j; double tmp = 6; for (i = 0; i < Ni; i++) { for (int j = 0; j<Nj; j++) { tmp = tmp + 2; } B[i] = tmp; } } ``` Polly gives us the following statements and memory accesses : ``` Statements { Stmt_for_body Domain := { Stmt_for_body[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_body[i0] -> [i0, 0, 0] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_04__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_body[i0] -> MemRef_tmp_11__phi[] }; Instructions { %tmp.04 = phi double [ 6.000000e+00, %entry.split ], [ %add.lcssa, %for.end ] } Stmt_for_inc Domain := { Stmt_for_inc[i0, i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9999 }; Schedule := { Stmt_for_inc[i0, i1] -> [i0, 1, i1] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_inc[i0, i1] -> MemRef_add_lcssa__phi[] }; Instructions { %tmp.11 = phi double [ %tmp.04, %for.body ], [ %add, %for.inc ] %add = fadd double %tmp.11, 2.000000e+00 %exitcond = icmp ne i32 %inc, 10000 } Stmt_for_end Domain := { Stmt_for_end[i0] : 0 <= i0 <= 9999 }; Schedule := { Stmt_for_end[i0] -> [i0, 2, 0] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_tmp_04__phi[] }; ReadAccess := [Reduction Type: NONE] [Scalar: 1] { Stmt_for_end[i0] -> MemRef_add_lcssa__phi[] }; MustWriteAccess := [Reduction Type: NONE] [Scalar: 0] { Stmt_for_end[i0] -> MemRef_B[i0] }; Instructions { %add.lcssa = phi double [ %add, %for.inc ] store double %add.lcssa, double* %arrayidx, align 8 %exitcond5 = icmp ne i64 %indvars.iv.next, 10000 } } ``` and the following dependences : ``` { Stmt_for_inc[i0, 9999] -> Stmt_for_end[i0] : 0 <= i0 <= 9999; Stmt_for_inc[i0, i1] -> Stmt_for_inc[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998; Stmt_for_body[i0] -> Stmt_for_inc[i0, 0] : 0 <= i0 <= 9999; Stmt_for_end[i0] -> Stmt_for_body[1 + i0] : 0 <= i0 <= 9998 } ``` When trying to expand this memory access : ``` { Stmt_for_inc[i0, i1] -> MemRef_tmp_11__phi[] }; ``` The new access map would look like this : ``` { Stmt_for_inc[i0, 9999] -> MemRef_tmp_11__phi_exp[i0] : 0 <= i0 <= 9999; Stmt_for_inc[i0, i1] ->MemRef_tmp_11__phi_exp[i0, 1 + i1] : 0 <= i0 <= 9999 and 0 <= i1 <= 9998 } ``` The idea to implement the expansion for PHI access is an idea from @Meinersbur and I don't understand why my implementation does not work. I should have miss something in the understanding of the idea. Contributed by: Nicolas Bonfante <nicolas.bonfante@gmail.com> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur Subscribers: llvm-commits, pollydev, Meinersbur Differential Revision: https://reviews.llvm.org/D36647 llvm-svn: 311619	2017-08-24 00:04:45 +00:00
Michael Kruse	7fac28fa4f	[ScopDetect] Include zero-iteration loops in loop count. Loop with zero iteration are, syntactically, loops. They have been excluded from the loop counter even for the non-profitable counters. This seems to be unintentially as the sentinel value of '0' minimal iterations does exclude such loops. Fix by never considering the iteration count when the sentinel value of 0 is found. This makes the recently added NumTotalLoops couter redundant with NumLoopsOverall, which now is equivalent. Hence, NumTotalLoops is removed as well. Note: The test case 'ScopDetect/statistics.ll' effectively does not check profitability, because -polly-process-unprofitable is passed to all test cases. llvm-svn: 311551	2017-08-23 13:29:59 +00:00
Jakub Kuderski	0ac1e585fc	[polly] Fix ScopDetectionDiagnostic test failure caused by r310940 Summary: ScopDetection used to check if a loop withing a region was infinite and emitted a diagnostic in such cases. After r310940 there's no point checking against that situation, as infinite loops don't appear in regions anymore. The test failure was observed on these two polly buildbots: http://lab.llvm.org:8011/builders/polly-arm-linux/builds/8368 http://lab.llvm.org:8011/builders/polly-amd64-linux/builds/10310 This patch XFAILs `ReportLoopHasNoExit.ll` and turns infinite loop detection into an assert. Reviewers: grosser, sanjoy, bollu Reviewed By: grosser Subscribers: efriedma, aemerson, kristof.beyls, dberlin, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D36776 llvm-svn: 311503	2017-08-22 22:01:53 +00:00
Tobias Grosser	4a07bbe3f6	[IRBuilder] Only emit alias scop metadata for arrays, but not scalars Summary: There is no need to emit alias metadata for scalars, as basicaa will easily distinguish them from arrays. This reduces the size of the metadata we generate. This is especially useful after we moved to -polly-position=before-vectorizer, where a lot more scalar dependences are introduced, which increased the size of the alias analysis metadata and made us commonly reach the limits after which we do not emit alias metadata that have been introduced to prevent quadratic growth of this alias metadata. This improves 2mm performance from 1.5 seconds to 0.17 seconds. Reviewers: Meinersbur, bollu, singam-sanjay Reviewed By: Meinersbur Subscribers: pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D37028 llvm-svn: 311498	2017-08-22 21:58:48 +00:00
Roman Gareev	0956a606ff	Disable the Loop Vectorizer in case of GEMM Currently, in case of GEMM and the pattern matching based optimizations, we use only the SLP Vectorizer out of two LLVM vectorizers. Since the Loop Vectorizer can get in the way of optimal code generation, we disable the Loop Vectorizer for the innermost loop using mark nodes and emitting the corresponding metadata. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D36928 llvm-svn: 311473	2017-08-22 17:38:46 +00:00
Michael Kruse	a28260f486	[test] Do not pipe binary data to FileCheck. llvm-svn: 311470	2017-08-22 17:09:56 +00:00
Michael Kruse	5b228bbb12	[ScopDetection] Add stat for total number of loops. The total number of loops is useful as a baseline comparing how many loops have been optimized in different configurations. llvm-svn: 311469	2017-08-22 17:09:51 +00:00
Tobias Grosser	6683c81af8	test/GPGPU/invalid-kernel-assert-verifymodule.ll also requires assertions llvm-svn: 311423	2017-08-22 03:12:29 +00:00
Siddharth Bhat	7bc77e87c8	[ScopInfo] Add option to treat all function parameters as dereferencible. Dragonegg generates most function parameters as pointers to the actual parameters. However, it does not mark these parameters with the dereferencable attribute. Polly is conservative when it comes to invariant load hoisting, thus we add runtime checks to invariant load hoisted pointers when we do not know that pointers are dereferencable. This is correct behaviour, but is a performance penalty. Add a flag that allows all pointer parameters to be dereferencable. That way, polly can speculatively load-hoist paramters to functions without runtime checks. Differential Revision: https://reviews.llvm.org/D36461 llvm-svn: 311329	2017-08-21 11:57:04 +00:00
Tobias Grosser	b09bd74da8	[GPGPU] Add llvm.powi to the libdevice supported functions These intrinsics are used in COSMO. llvm-svn: 311324	2017-08-21 09:52:08 +00:00
Tobias Grosser	5170b6627a	[GPGPU] Add log / logf to the libdevice supported functions These two functions are used in COSMO llvm-svn: 311322	2017-08-21 09:00:31 +00:00
Michael Kruse	d091bf8d8e	[MatMul] Make MatMul detection independent of internal isl representations. The pattern recognition for MatMul is restrictive. The number of "disjuncts" in the isl_map containing constraint information was previously required to be 1 (as per isl_*_coalesce - which should ideally produce a domain map with a single disjunct, but does not under some circumstances). This was changed and made more flexible. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Differential Revision: https://reviews.llvm.org/D36460 llvm-svn: 311302	2017-08-20 21:31:11 +00:00
Tobias Grosser	e32498c9c3	Revert "[GPGPU] Simplify PPCGSCop to reduce compile time [NFC]" We still see some issues with parameter space mismatches. Revert this to get a clean baseline. We will recommit after these issues have been resolved. This reverts commit 0e360a14194f722ded7aa2bc9d4be2ed2efeeb49. llvm-svn: 311268	2017-08-19 23:49:26 +00:00
Tobias Grosser	ecb94a0392	[GPGPU] Correctly initialize array order and fixed_element information Summary: This information is necessary for PPCG to perform correct life range reordering. With these changes applied we can live-range reorder some of the important kernels in COSMO. We also update and rename one test case, which previously could not be optimized and now is optimized thanks to live-range reordering. To preserve test coverage we add a new test case scalar-writes-in-scop-requires-abort.ll, which exercises our automatic abort in case of scalar writes in the kernel. Reviewers: Meinersbur, bollu, singam-sanjay Subscribers: nemanjai, pollydev, llvm-commits, kbarton Tags: #polly Differential Revision: https://reviews.llvm.org/D36929 llvm-svn: 311259	2017-08-19 20:21:22 +00:00
Philipp Schaad	50139f0f38	[PPCG] Only add Kernel argument sizes for OpenCL, not CUDA runtime Kernel argument sizes now only get appended to the kernel launch parameter list if the OpenCL runtime is selected, not if CUDA runtime is chosen. Differential revision: D36925 llvm-svn: 311248	2017-08-19 17:04:57 +00:00
Tobias Grosser	43df2020e7	[GPGPU] Collect parameter dimension used in MemoryAccesses When using -polly-ignore-integer-wrapping and -polly-acc-codegen-managed-memory we add parameter dimensions lazily to the domains, which results in PPCG not including parameter dimensions that are only used in memory accesses in the kernel space. To make sure these parameters are still passed to the kernel, we collect these parameter dimensions and align the kernel's parameter space before code-generating it. llvm-svn: 311239	2017-08-19 12:58:28 +00:00
Andreas Simbuerger	8d5b257d02	[Polly][Bug fix] Wrong dependences filtering during Fully Indexed expansion Summary: When trying to expand memory accesses, the current version of Polly uses statement Level dependences. The actual implementation is not working in case of multiple dependences per statement. For example in the following source code : ``` void mse(double A[Ni], double B[Nj], double C[Nj], double D[Nj]) { int i,j; for (j = 0; j < Ni; j++) { for (int i = 0; i<Nj; i++) S: B[i] = i; for (int i = 0; i<Nj; i++) T: D[i] = i; U: A[j] = B[j]; C[j] = D[j]; } } ``` The statement U has two dependences with S and T. The current version of polly fails during expansion. This patch aims to fix this bug. For that, we use Reference Level dependences to be able to filter dependences according to statement and memory ref. The principle of expansion remains the same as before. We also noticed that we need to bail out if load come after store (at the same position) in same statement. So a check was added to isExpandable. Contributed by: Nicholas Bonfante <nicolas.bonfante@insa-lyon.fr> Reviewers: Meinersbur, simbuerg, bollu Reviewed By: Meinersbur, simbuerg Subscribers: pollydev, llvm-commits Differential Revision: https://reviews.llvm.org/D36791 llvm-svn: 311165	2017-08-18 15:01:18 +00:00

1 2 3 4 5 ...

1493 Commits