llvm-project

Commit Graph

Author	SHA1	Message	Date
Siddharth Bhat	b2f754e39f	[Docs] Fix Sphinx documentation in CMake check. Summary: - `include(AddSphinxTarget)` needs to occur before checking `SPHINX_FOUND`. - `docs-polly-html` and `docs-polly-man` are now usable again. - Perhaps we should build docs in the CI as well? Differential Revision: https://reviews.llvm.org/D33386 llvm-svn: 303549	2017-05-22 13:16:02 +00:00
Michael Kruse	706f79ab14	[CodeGen] Support partial write accesses. Allow the BlockGenerator to generate memory writes that are not defined over the complete statement domain, but only over a subset of it. It generates a condition that evaluates to 1 if executing the subdomain, and only then executes the access. Only write accesses are supported. Read accesses would require a PHINode which has a value if the access is not executed. Partial write makes DeLICM able to apply mappings that are not defined over the entire domain (for instance, a branch that leaves a loop with a PHINode in its header; a MemoryKind::PHI write when leaving is never read by its PHI read). Differential Revision: https://reviews.llvm.org/D33255 llvm-svn: 303517	2017-05-21 22:46:57 +00:00
Tobias Grosser	7be8245a40	[ScopInfo] Translate updateDimensionality to isl C++ [NFC] llvm-svn: 303514	2017-05-21 20:38:33 +00:00
Tobias Grosser	a3f7546931	[isl++] add isl_constraint to C++ bindings [NFC] llvm-svn: 303512	2017-05-21 20:23:26 +00:00
Tobias Grosser	3137f2cb65	[ScopInfo] Translate wrapConstantDimensions to isl C++ [NFC] llvm-svn: 303511	2017-05-21 20:23:23 +00:00
Tobias Grosser	99ea1d0808	[ScopInfo] Translate addRangeBoundsToSet to isl C++ [NFC] llvm-svn: 303510	2017-05-21 20:23:20 +00:00
Tobias Grosser	1f94dcee0b	Fix include order to stop clang-format complaints llvm-svn: 303509	2017-05-21 16:34:09 +00:00
Tobias Grosser	7205f93a98	[ScheduleOptimizer] Move schedule construction to isl C++ [NFC] llvm-svn: 303508	2017-05-21 16:21:33 +00:00
Tobias Grosser	b5f61bdeeb	[Simplify] Move to isl C++ llvm-svn: 303507	2017-05-21 16:12:21 +00:00
Tobias Grosser	6151654c00	[isl++] Export (almost) all functions from isl This commit exports the majority of the isl functions to the isl C++ interface. The official isl C++ bindings still require discussions to define the set of functions that are officially supported. As a result, the officially exported functionality will be rather limited until these discussions conclude and a non-trivial set of isl functions is officially supported through the isl C++ bindings. Starting from this commit we ship with Polly an extended version of the official isl C++ bindings to ensure sufficient functionality is available such that LLVM developers can make efficient use of isl through C++. The practical experience Polly gathers with its bindings will then be used to gradually upstream patches to isl to extend the official bindings. llvm-svn: 303506	2017-05-21 16:00:32 +00:00
Tobias Grosser	443f6814a1	[isl++] Rebase isl C++ bindings on top of 29aee98ce This reduces the diff to the official isl C++ bindings and solves a correctness issue with isl::booleans, where isl_bool_error results were accidentally converted to isl::boolean::true. llvm-svn: 303505	2017-05-21 15:59:15 +00:00
Tobias Grosser	3320485961	[isl++] Move isl raw_ostream printers into separate header Instead of relying on these functions to be part of the isl C++ bindings, we just define this functionality independently. This allows us to use isl C++ bindings that do not contain LLVM specific functionality. llvm-svn: 303503	2017-05-21 13:16:05 +00:00
Tobias Grosser	ee61ebb134	Fix buildbots after r303429 A test case with a GPU runline was added without setting 'REQUIRES=pollyacc'. We drop the GPU run line, as the basic functionality can already be tested with the normal code generation. llvm-svn: 303485	2017-05-20 04:22:26 +00:00
Siddharth Bhat	b7f68b8c9e	[Fortran Support] Materialize outermost dimension for Fortran array. - We use the outermost dimension of arrays since we need this information to generate GPU transfers. - In general, if we do not know the outermost dimension of the array (because the indexing expression is non-affine, for example) then we simply cannot generate transfer code. - However, for Fortran arrays, we can use the Fortran array representation which stores the dimensions of all arrays. - This patch uses the Fortran array representation to generate code that computes the outermost dimension size. Differential Revision: https://reviews.llvm.org/D32967 llvm-svn: 303429	2017-05-19 15:07:45 +00:00
Tobias Grosser	d8945baa0a	[ScopDetection] Allow detection of full functions This is useful when only analyzing functions. llvm-svn: 303420	2017-05-19 12:13:02 +00:00
Tobias Grosser	977158488e	[ScopInfo] Fix typo in documentation llvm-svn: 303405	2017-05-19 04:01:52 +00:00
Tobias Grosser	45e9fd1810	[ScopInfo] Gracefully handle long compile times The following test case tried to compute the lexicographic minimum of the following set during alias analysis, which caused very long compile time: [p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] : (517p_3 >= 70944 - 298p_2 and 256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or (409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <= -57120 + 298p_2 + 409p_4) or (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >= 17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or (104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or (409p_4 <= 57119 - 298p_2 and 256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or (517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <= -70689 + 298p_2 + 517p_3) or (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and 256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5 and 256i0 <= -57120 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 149p_5 <= -17329 + 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17455 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 + 50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and 256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 + 298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) } We now guard the potentially expensive functions in Polly's scop analysis to gracefully bail out in case of overly long compilation times. llvm-svn: 303404	2017-05-19 03:45:00 +00:00
Michael Kruse	960c0d0b04	[ScopInfo] Fix r302231 to use logical or (||). NFC. In r302231 we mistakenly use bitwise or (|) instead of logical or (||). This patch fixes that. Contributed-by: Sameer AbuAsal <sabuasal@codeaurora.org> Differential Revision: https://reviews.llvm.org/D33337 llvm-svn: 303386	2017-05-18 21:55:36 +00:00
Reid Kleckner	96ab8726a3	[IR] De-virtualize ~Value to save a vptr Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362	2017-05-18 17:24:10 +00:00
Siddharth Bhat	06e3c74d83	[Fortran Support] Change "global" pattern match to work for params Summary: - Rename global / local naming convention that did not make much sense to Visible / Invisible, where the visible refers to whether the ALLOCATE call to the Fortran array is present in the current module or not. - This match now works on both cross fortran module globals and on parameters to functions since neither of them are necessarily allocated at the point of their usage. - Add testcase that matches against both a load and a store against function parameters. Differential Revision: https://reviews.llvm.org/D33190 llvm-svn: 303356	2017-05-18 16:47:13 +00:00
Michael Kruse	1198b1f8d6	[ScopInfo] Remove unused MemoryAccess::BaseName. NFC. llvm-svn: 303189	2017-05-16 16:52:24 +00:00
Tobias Grosser	e890d5ba1b	Drop nonexisting ScopPassManager directory llvm-svn: 303066	2017-05-15 14:12:30 +00:00
Tobias Grosser	ff3f38b2c5	Adjust formatting llvm-svn: 303065	2017-05-15 14:12:27 +00:00
Philip Pfaffe	762ec5a3eb	[Polly][NewPM] Add missing Unittests llvm-svn: 303064	2017-05-15 13:52:10 +00:00
Philip Pfaffe	35bdcaf9e9	[Polly][NewPM][WIP] Add a ScopPassManager This patch adds both a ScopAnalysisManager and a ScopPassManager. The ScopAnalysisManager is itself a Function-Analysis, and manages analyses on Scops. The ScopPassManager takes care of building Scop pass pipelines. This patch is marked WIP because I've left two FIXMEs which I need to think about some more. Both of these deal with invalidation: Deferred invalidation is currently not implemented. Deferred invalidation deals with analyses which cache references to other analysis results. If these results are invalidated, invalidation needs to be propagated into the caching analyses. The ScopPassManager as implemented assumes that ScopPasses do not affect other Scops in any way. There has been some discussion about this on other patch threads, however it makes sense to reiterate this for this specific patch. I'm uploading this patch even though it's incomplete to encourage discussion and give you an impression of how this is going to work. Differential Revision: https://reviews.llvm.org/D33192 llvm-svn: 303062	2017-05-15 13:43:01 +00:00
Philip Pfaffe	bbb86719c1	[Polly][CMake] Exclude isl_config from the polly-check-format target. Summary: The custom `polly-check-format` target runs clang-format over all source files in the directory tree excluding lib/External. `isl_config.h` is a header file that is generated by CMake in the build directory, and it's not correctly formatted (which I also wouldn't consider necessary, as it is a generated file). If the build directory is actually inside the Polly source directory (which it might be if you're building Polly out-of-tree), that check always fails. Hence this patch excludes this file from the check-format target. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: mgorny, llvm-commits, pollydev Tags: #polly Differential Revision: https://reviews.llvm.org/D33192 llvm-svn: 303060	2017-05-15 13:20:26 +00:00
Philip Pfaffe	3030bf0c81	[Polly][Fortran Support] Fix two testcases for the loadable-library use-case llvm-svn: 303057	2017-05-15 12:58:31 +00:00
Philip Pfaffe	838e0884ef	[Polly][NewPM] Port ScopInfo to the new PassManager llvm-svn: 303056	2017-05-15 12:55:14 +00:00
Siddharth Bhat	aed4b5682d	[NFC] [Fortran Support] Fix findFADGlobalNonAlloc pattern match comment llvm-svn: 303052	2017-05-15 11:49:19 +00:00
Siddharth Bhat	0fe7231a2f	[Fortran Support] Add pattern match for Fortran Arrays that are parameters. - This breaks the previous assumption that Fortran Arrays are `GlobalValue`. - The names of functions were getting unwieldy. So, I renamed the Fortran related functions. Differential Revision: https://reviews.llvm.org/D33075 llvm-svn: 303040	2017-05-15 08:41:30 +00:00
Siddharth Bhat	9746f817ea	[Simplify] Fix r302986 that introduced non-inferrable templates. - auto + decltype + template use was not inferrable in `Transform/Simplify.cpp accessesInOrder`. - changed code to explicitly construct required vector instead of using higher order iterator helpers. - Failing compiler spec: Apple LLVM version 7.3.0 (clang-703.0.31) Target: x86_64-apple-darwin15.6.0 llvm-svn: 303039	2017-05-15 08:18:51 +00:00
Tobias Grosser	497fdd7dff	[Simplify] Remove some leftover dead code llvm-svn: 303007	2017-05-14 09:20:56 +00:00
Tobias Grosser	b693f42b71	[Polly] Fix code generation of llvm.expect intrinsic At the time of code generation, an instruction with an llvm intrinsic is ignored in copyBB. However, if the value of the instruction is used later in the program, the value needs to be synthesized. However, this is causing some issues with the instructions being generated in a hoisted basic block. Removing llvm.expect from the list of ignored intrinsics fixes this bug. This resolves http://llvm.org/PR32324. Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in> Tags: #polly Differential Revision: https://reviews.llvm.org/D32992 llvm-svn: 303006	2017-05-14 09:09:54 +00:00
Michael Kruse	fa7be88378	[Simplify] Remove identical write removal. NFC. Removal of overwritten writes currently encompasses all the cases of the identical write removal. There is an observable behavioral change in that the last, instead of the first, MemoryAccess is kept. This should not affect the generated code, however. Differential Revision: https://reviews.llvm.org/D33143 llvm-svn: 302987	2017-05-13 12:20:57 +00:00
Michael Kruse	f263610b82	[Simplify] Remove writes that are overwritten. Remove memory writes that are overwritten by later writes. This works for StoreInsts (store double 21.0, double* %A; store double 42.0, double* %A), scalar writes at the end of a statement, and mixes of these. Multiple writes can be the result of DeLICM, which might map multiple writes to the same location when it knows that these do not conflict (for instance because they write the same value). Such writes interfere with pattern-matched optimizations such as gemm and may not get removed by other LLVM passes after code generation. Differential Revision: https://reviews.llvm.org/D33142 llvm-svn: 302986	2017-05-13 11:49:34 +00:00
Michael Kruse	aeb4864090	[Simplify] Reset all stats between runs. llvm-svn: 302926	2017-05-12 17:23:07 +00:00
Philip Pfaffe	5cc87e3ab3	[Polly][NewPM] Port ScopDetection to the new PassManager Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture. This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now. Reviewers: Meinersbur, grosser Reviewed By: grosser Subscribers: pollydev, sanjoy, nemanjai, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31459 llvm-svn: 302902	2017-05-12 14:37:29 +00:00
Siddharth Bhat	d0d29addf9	[NFC] [Fortran Support] Run -instnamer on testcases llvm-svn: 302892	2017-05-12 12:36:04 +00:00
Siddharth Bhat	f16db04cd5	[FIX] Fix regression caused by `c29f4ed`, testcase matches output - Commit changed codegen for induction variables - Updated testcase llvm-svn: 302891	2017-05-12 11:34:51 +00:00
Philip Pfaffe	cda7152fcb	[Polly][CMake] Fix variable name in target exports llvm-svn: 302888	2017-05-12 10:39:38 +00:00
Siddharth Bhat	c05fcc0d9e	[NFC] [Fortran Support] Cleanup Fortran Array pattern match testcases - Move the testcases to ScopInfo/ since the processing takes place in ScopBuilder. - Cleanup testcases, run -polly-canonicalize on them, find minimal set of opt parameters. llvm-svn: 302886	2017-05-12 09:37:39 +00:00
Hongbin Zheng	5b263d4ce1	[Polly] Remove unused header llvm-svn: 302868	2017-05-12 02:21:50 +00:00
Hongbin Zheng	4fe342cb75	[Polly] Generate more 'canonical' induction variable Today Polly generates induction variables in this way: polly.indvar = phi 0, polly.indvar.next ... polly.indvar.next = polly.indvar + stride polly.loop_cond = predicate polly.indvar, (UB - stride) instead of: polly.indvar = phi 0, polly.indvar.next ... polly.indvar.next = polly.indvar + stride polly.loop_cond = predicate polly.indvar.next, UB The way Polly generates induction variables causes some problems in the indvar simplify pass. This patch makes Polly generate the latter form, by assuming the induction variable never overflows. Differential Revision: https://reviews.llvm.org/D33089 llvm-svn: 302866	2017-05-12 02:17:15 +00:00
Michael Kruse	d644ec7647	[DeLICM] Use input access heuristic for mapped PHI WRITEs. As with the scalar operand of the initial StoreInst, also use input accesses when searching for new opportunities after mapping a PHI write. The same rationale applies here: After LICM has been applied, the promoted value will either be an instruction in the same statement (in which case we fall back to try every scalar access of the statement), or in another statement such that there will be such an input access. In the latter case other scalars cannot have originated from the same register promotion, at least not by LICM. This mostly helps to decrease compilation time and makes debugging easier by not pursuing unpromising routes. In some circumstances, it may change the compiler's output. llvm-svn: 302839	2017-05-11 22:56:59 +00:00
Michael Kruse	4c27643398	[DeLICM] Lookup input accesses. Previous to this patch, we used VirtualUse to determine the input access of an llvm::Value in a statement. The input access is the READ MemoryAccess that makes a value available in that statement, which can either be a READ of a MemoryKind::Value or the MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input access to heuristically find a candidate to map without searching all possible values. This might modify the behaviour in that PHI accesses were previously not considered input accesses. This was unintentionally lost when "VirtualUse" was extracted from the "Known Knowledge" patch. llvm-svn: 302838	2017-05-11 22:56:46 +00:00
Michael Kruse	bfaa1857b3	[VirtualInstruction] Do a lookup instead of a linear search. NFC. llvm-svn: 302837	2017-05-11 22:56:27 +00:00
Michael Kruse	e60eca7316	[ScopInfo] Keep scalar access dictionaries up-to-date. NFC. When removing a MemoryAccess, also remove it from maps pointing to it. This was already done for InstructionToAccess, but not yet for ValueReads, ValueWrites and PHIWrites as those were only used during the ScopBuilder phase. Keeping them updated allows us to use them later as well. llvm-svn: 302836	2017-05-11 22:56:12 +00:00
Michael Kruse	07e315e780	[Simplify] Remove identical scalar writes. After DeLICM, it is possible to have two writes of the same value to the same location in the same statement when it determined that those writes do not conflict (write the same value). Teach -polly-simplify to remove one of the writes. It interferes with the pattern matching of matrix-multiplication kernels and also seems to not be optimized away by LLVM. The algorithm is simple, has O(n^2) behaviour (n = max number of MemoryAccesses in a statement) and only matches the most obvious cases, but seems to be enough to pattern-match Boost ublas gemm. Not handled cases include: - StoreInst instructions (a.k.a. explicit writes), since the value might be loaded or overwritten between the two stores. - PHINode, especially LCSSA, when the PHI value matches with another's. - Partial writes (in preparation) llvm-svn: 302805	2017-05-11 15:07:38 +00:00
Siddharth Bhat	abea18feba	[NFC] [Fortran Support] move Fortran array detection testcases move these testcases to where they belong: ScopDetect llvm-svn: 302735	2017-05-10 21:35:14 +00:00
Michael Kruse	a0987b83d5	[Simplify] Mark variables as used. NFC. Mark one more variable as used that is needed in assertions. llvm-svn: 302726	2017-05-10 20:45:10 +00:00
Michael Kruse	4aac59cee1	[Simplify] Mark variables as used. NFC. Mark variables as used that are needed in assertions. llvm-svn: 302725	2017-05-10 20:42:02 +00:00
Siddharth Bhat	f5c81fb199	[Fix][Fortran Support] Don't use -debug-only in pattern matching test cases -debug-only is unnecessary and causes the tests to break in Release mode. Remove the option to opt in the test cases. llvm-svn: 302722	2017-05-10 20:10:17 +00:00
Michael Kruse	f41f274bf8	[DeLICM] Avoid compiler warning. NFC. gcc 5.4 warns about using a C-style cast to cast away a const. Use a const_cast instead. llvm-svn: 302715	2017-05-10 19:58:52 +00:00
Michael Kruse	f69a7c306b	[DeLICM] Always normalize domain. NFC. Some isl functions can simplify their __isl_keep arguments. The argument object after the call uses different constraints to represent the same set. Different constraints can result in different outputs when printed to a string. In assert builds, additional isl functions are called (in assert()); as mentioned, these can change the internal representation of their read-only arguments such that printed strings are different in debug and non-debug builds. What happened here is that a call to isl_set_is_equal inside an assert in getScatterFor normalizes one of its arguments such that one redundant constraint is removed. The redundant constraint therefore does not appear in the string representing the domain, which FileCheck notices as a regression test failure compared to a build with assertions disabled. This fix removes the redundant constraints from the domain from the start such that the redundant constraint is removed in assert and non-assert builds. Isl adds a flag to such sets such that the removal of redundancies is not done multiple times (here: by isl_set_is_equal). Thanks to Tobias Grosser for reporting and hinting at the cause. llvm-svn: 302711	2017-05-10 19:50:45 +00:00
Siddharth Bhat	c47f039efd	[Fix] [Fortran Support] Fix variable name & make testcase activate on release There was: #ifdef NDEBUG This should be: #ifndef NDEBUG Also, the variable name was incorrect. Fixed the variable name. llvm-svn: 302696	2017-05-10 17:27:48 +00:00
Philip Pfaffe	d399607f65	[Polly][CMake] Fix syntactical errors in the exported config llvm-svn: 302657	2017-05-10 13:51:30 +00:00
Siddharth Bhat	f2dbba8183	[Fortran Support] Detect Fortran arrays & metadata from dragonegg output Add the ability to tag certain memory accesses as those belonging to Fortran arrays. We do this by pattern matching against known patterns of Dragonegg's LLVM IR output from Fortran code. Fortran arrays have metadata stored with them in a struct. This struct is called the "Fortran array descriptor", and a reference to this is stored in each MemoryAccess. Differential Revision: https://reviews.llvm.org/D32639 llvm-svn: 302653	2017-05-10 13:11:20 +00:00
Siddharth Bhat	8ac5340a4e	[GPUJIT] Disabled gcc's -Wpedantic for use of dlsym The ISO C standard that GCC checks against does not strictly define the behavior of converting a `void*` pointer to a function pointer, but dlsym's POSIX standard does. The retrieval of function pointers through dlsym in this case generates an unnecessary amount of warnings for every API function assignment, bloating the output. This patch removes GCC's `-Wpedantic` flag for retrieval and assignment of these functions. This simplifies debugging the output of GPUJIT. Differential Revision: https://reviews.llvm.org/D33008 llvm-svn: 302638	2017-05-10 11:51:44 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We removed already all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct to the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Tobias Grosser	0f7ce83018	Add noreturn attribute to avoid warnings about missing initialization Before this change we saw warnings such as: tools/GPURuntime/GPUJIT.c:1566:3: warning: variable 'DevPtr' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: llvm-svn: 302621	2017-05-10 05:20:56 +00:00
Tobias Grosser	1a2e0e6415	Fix formatting in Polly llvm-svn: 302620	2017-05-10 04:53:59 +00:00
Chandler Carruth	d742e5efa8	Update Polly for LLVM API change r302571 that removed varargs functions with a nullptr sentinel in favor of nicely typed variadic templates. llvm-svn: 302618	2017-05-10 02:39:35 +00:00
Siddharth Bhat	a90be207c6	[Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases). Reviewers: grosser, Meinersbur, bollu Reviewed By: bollu Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D32961 llvm-svn: 302515	2017-05-09 10:45:52 +00:00
Siddharth Bhat	0c8dcfd743	[Polly][GPUJIT] Fixed OpenCL 2.0 min requirement for Error codes Summary: Removed OpenCL error code identifiers introduced in version 2.0. Reviewers: grosser, bollu Reviewed By: bollu Subscribers: yaxunl, Anastasia, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D32962 llvm-svn: 302423	2017-05-08 14:10:37 +00:00
Siddharth Bhat	17f01968f1	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302379	2017-05-07 21:03:46 +00:00
Siddharth Bhat	5cf77125fc	[Polly] [GPUJIT] Adapted argument capitalization to fit standard Summary: Function argument naming changed to reflect capitalization standards. Reviewers: grosser, Meinersbur Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D32854 llvm-svn: 302376	2017-05-07 19:53:35 +00:00
Siddharth Bhat	448b8079cc	[Polly] [GPUJIT] Moved error prints to stderr Summary: Errors previously printed to stdout now get printed to stderr. Reviewers: grosser, Meinersbur Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D32852 llvm-svn: 302375	2017-05-07 18:31:25 +00:00
Tobias Grosser	c6ad42165f	Really disable test as intended in the previous commit llvm-svn: 302360	2017-05-06 19:18:19 +00:00
Tobias Grosser	0f4e94673d	Disable test to avoid buildbot noise This test was introduced in r302339. It works on my system, but breaks on the buildbots. llvm-svn: 302358	2017-05-06 18:50:28 +00:00
Michael Kruse	5ae08c0ebb	[DeLICM] Known knowledge. Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict if the known content is the same. The content information is computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339	2017-05-06 14:03:58 +00:00
Michael Kruse	2a8f6f843f	[CMake] Introduce POLLY_BUNDLED_JSONCPP. Allow using a system-installed jsoncpp library instead of the bundled one with the setting POLLY_BUNDLED_JSONCPP=OFF. This fixes llvm.org/PR32929 Differential Revision: https://reviews.llvm.org/D32922 llvm-svn: 302336	2017-05-06 13:42:15 +00:00
Michael Kruse	391a2ac09b	[ScopBuilder] Move Scop::init to ScopBuilder. NFC. Scop::init is used only during SCoP construction. Therefore ScopBuilder seems the more appropriate place for it. We integrate it into its only caller ScopBuilder::buildScop where some other construction steps already took place. Differential Revision: https://reviews.llvm.org/D32908 llvm-svn: 302276	2017-05-05 20:09:08 +00:00
Tobias Grosser	c1ddedc657	Fix typo llvm-svn: 302244	2017-05-05 15:46:01 +00:00
Michael Kruse	f1052ceb5e	[ScopBuilder] Do not verify unfeasible SCoPs. SCoPs with unfeasible runtime context are thrown away and therefore do not need their uses verified. The added test case requires a complexity limit to be exceeded. Normally, error statements are removed from the SCoP and for that reason are skipped during the verification. If there is an unfeasible runtime context (here: because of the complexity limit being reached), the removal of error statements and other SCoP construction steps are skipped to not waste time. Error statements are not modeled in SCoPs and therefore have no requirements on whether the scalars used in them are available. llvm-svn: 302234	2017-05-05 13:38:35 +00:00
Tobias Grosser	d5727c5011	Fix handling of signWrappedSets in access relations Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip manually bounding the access relation in case the parameter of the load instruction is already a wrapped set. Later on we assume that the lower bound on the set is always smaller or equal to the upper bound on the set. Bug 32715 manages to construct a sign wrapped set, in which case the assertion does not necessarily hold. Fix this by handling a sign wrapped set similar to a normal wrapped set, that is skipping the computation. Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Reviewers: grosser Subscribers: pollydev, llvm-commits Tags: #Polly Differential Revision: https://reviews.llvm.org/D32893 llvm-svn: 302231	2017-05-05 13:20:47 +00:00
Siddharth Bhat	c1267b9baa	Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen" This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933. Patches should have been submitted in the order of: 1. D32852 2. D32854 3. D32431 I mistakenly pushed D32431(3) first. Reverting to push in the correct order. llvm-svn: 302217	2017-05-05 09:02:08 +00:00
Siddharth Bhat	51904ae35a	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302215	2017-05-05 07:54:49 +00:00
Michael Kruse	704c03e03b	[ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH. It was forgotten in r302157. llvm-svn: 302163	2017-05-04 15:55:54 +00:00
Michael Kruse	eedae7630a	Introduce VirtualUse. NFC. If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157	2017-05-04 15:22:57 +00:00
Michael Kruse	45d5cf47bf	[CMake] Remove POLLY_TEST_DIRECTORIES. The test subdirectory POLLY_TEST_DIRECTORIES was heavily outdated and only used in out-of-LLVM-tree builds (to generate polly-test-${subdir} targets). llvm-svn: 302142	2017-05-04 12:21:25 +00:00
Tobias Grosser	3f25a7e8ee	[ScopDetection] Check for already known required-invariant loads [NFC] For certain test cases we spent over 50% of the scop detection time in checking if a load is likely invariant. We can avoid most of these checks by testing early on if a load is expected to be invariant. Doing this reduces scop-detection time on a large benchmark from 52 seconds to just 25 seconds. No functional change is expected. llvm-svn: 302134	2017-05-04 10:16:20 +00:00
Tobias Grosser	1859463876	Adjust test case to not trigger the SCEV optimization committed in r302096 This makes sure we still test the case that a PHI-NODE cannot be analyzed by scalar evolution and consequently must be code generated explicitly. As Michael's optimization triggers only on a very specific "add %iv, %step" pattern, just changing 'add' to 'mul' adds back test coverage. llvm-svn: 302132	2017-05-04 08:56:54 +00:00
Tobias Grosser	e2ccc3fb33	[ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters LLVM-IR names are commonly available in debug builds, but often not in release builds. Hence, using LLVM-IR names to identify statements or memory reference results makes the behavior of Polly depend on the compile mode. This is undesirable. Hence, we now just number the statements instead of using LLVM-IR names to identify them (this issue has previously been brought up by Zino Benaissa). However, as LLVM-IR names help in making test cases more readable, we add an option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by default set in the polly tests to make test cases more readable. This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the following test case provided by Eli Friedman <efriedma@codeaurora.org> (already used in one of the previous commits): struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } For a larger benchmark I have on-hand (10000 loops), this reduces the time for running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%. The reason for this large speedup is that our previous use of printAsOperand had a quadratic cost, as for each printed and unnamed operand the full function was scanned to find the instruction number that identifies the operand. We do not need to adjust the way memory reference ids are constructed, as they do not use LLVM values. Reviewed by: efriedma Tags: #polly Differential Revision: https://reviews.llvm.org/D32789 llvm-svn: 302072	2017-05-03 20:08:52 +00:00
Siddharth Bhat	88619946b6	[CUDA Managed Memory] Fix regression introduced by Managed Memory - Fixes breakage from commit 5536f. - Interference with commit 764f3 caused testcase to fail. Reverting 764f3 allows commit 5536f to succeed. - Generated kernel code was slightly different due to 764f3, which caused testcase to fail. llvm-svn: 302021	2017-05-03 13:15:27 +00:00
Tobias Grosser	72684bbaf5	[ScopInfo] Remove code not needed anymore after r302004 llvm-svn: 302005	2017-05-03 08:02:32 +00:00
Tobias Grosser	8133128c17	[ScopInfo] Do not add array name into memory reference ids Before this change a memory reference identifier had the form: <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11 After this change, we use the format: <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0 The name of the array that is accessed through a memory reference is not necessary to uniquely identify a memory reference, but was only added to provide additional information for debugging. We drop this information now for the following two reasons: 1) This shortens the names and consequently improves readability 2) This removes a second location where we decide on the name of a scop array, leaving us only with the location where the actual scop array is created. Having after 2) only a single location to name scop arrays will allow us to change the naming convention of scop arrays more easily, which we will do in a future commit to reduce compilation time. llvm-svn: 302004	2017-05-03 07:57:35 +00:00
Siddharth Bhat	6c3d19ba45	[NFC] [IslAST] fix typo: "int the" -> "in the" llvm-svn: 301925	2017-05-02 14:54:49 +00:00
Michael Kruse	ecbd57e98a	[CMake] Move PollyCore to Polly project folder. This keeps the artifacts consistently structured in the "Polly" folder of Visual Studio solutions. llvm-svn: 301779	2017-04-30 21:07:05 +00:00
Hongbin Zheng	e9a9932712	[Polly] Make PollyCore depend on intrinsics_gen llvm-svn: 301734	2017-04-29 03:12:17 +00:00
Tobias Grosser	3d76f2ccd3	[tests] Ensure all test cases use named variables This makes it easier to read and possibly even modify the test cases, as there is no need to keep the variable increment in steps of one. More importantly, by using explicit variable names we do not need to rely on the implicit numbering of statements when dumping the scop information. In a future commit, this implicit numbering will likely not be used any more to refer to LLVM-IR values as it is very expensive to construct. llvm-svn: 301689	2017-04-28 21:16:29 +00:00
Tobias Grosser	f13722177b	[Codegen] Disable Polly's codegen verification by default As has been reported in the previous commit, codegen verification can result in quadratic compile time increases for large functions with many scops. This is certainly not something we would like to have in the Polly default configuration. Hence, we disable codegen verification by default -- also to see if this resolves some of the compilation timeouts we currently see on the AOSP buildbots. We still leave this feature in Polly as it has proven _very_ useful for debugging. In fact, we may want to have a discussion if we can bring this feature back in a way that does not impact compilation time so much. Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and for providing the test case in the previous commit (where I forgot to acknowledge him). llvm-svn: 301670	2017-04-28 19:15:28 +00:00
Tobias Grosser	d439911f73	[CodeGen] Skip verify if -polly-codegen-verify is set to false Before this change, we always tried to verify the function and printed verification errors, but just did not abort in case -polly-codegen-verify=false was set and verification failed. As verification can become very costly -- for large functions with many scops we may verify the very same function very often -- this can affect compile time very negatively. Hence, we respect the -polly-codegen-verify flag with this check, ensuring that no verification is run if -polly-codegen-verify=false. This reduces code generation time from 26 seconds to 4 seconds on the test case below with -polly-codegen-verify=false: struct X { int x; }; void a(); #define SIG (int x, X y, X z) typedef void (fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } llvm-svn: 301669	2017-04-28 19:08:20 +00:00
Siddharth Bhat	abed49699b	[Polly] [PPCGCodeGeneration] Add managed memory support to GPU code generation. This needs changes to GPURuntime to expose synchronization between host and device. 1. Needs better function naming, I want a better name than "getOrCreateManagedDeviceArray" 2. DeviceAllocations is used by both the managed memory and the non-managed memory path. This exploits the fact that the two code paths are never run together. I'm not sure if this is the best design decision Reviewed by: PhilippSchaad Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301640	2017-04-28 11:16:30 +00:00
Tobias Grosser	287942ae82	Update to isl-0.18-592-gb50ad59 This is just a general maintenance update. llvm-svn: 301624	2017-04-28 06:11:17 +00:00
Tobias Grosser	c96c1d8c87	[ScopInfo] Consider only write-free dereferencable loads as invariant When we introduced in r297375 support for hoisting loads that are known to be dereferencable without any conditional guard, we forgot to keep the check to verify that no other write into the very same location exists. This change ensures now that dereferencable loads are allowed to access everything, but can only be hoisted in case no conflicting write exists. This resolves llvm.org/PR32778 Reported-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 301582	2017-04-27 20:08:16 +00:00
Michael Kruse	792a6fcc57	[CMake] Use object library to build the two flavours of Polly. Polly comes in two library flavors: One loadable module to use the LLVM framework -load mechanism, and another one that host applications can link to. These have very different requirements for Polly's own dependencies. The loadable module assumes that all its LLVM dependencies are already available in the address space of the host application, and is not allowed to bring in its own copy of any LLVM library (including the NVPTX backend in case of Polly-ACC). The non-module library is intended to be linked to using target_link_libraries. CMake would then resolve all of its dependencies, including NVPTX and ensure that only a single instance of each library will be used. Differential Revision: https://reviews.llvm.org/D32442 llvm-svn: 301558	2017-04-27 16:13:03 +00:00
Philip Pfaffe	5d790fc03c	[Polly][Cmake] Add missing include paths to exported cmake config llvm-svn: 301552	2017-04-27 16:03:42 +00:00
Hongbin Zheng	0f8f177682	[Polly] Do not introduce address space cast Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519	2017-04-27 06:42:14 +00:00
Michael Kruse	e6d2bebb25	[unittests/DeLICM] Add test for Written vs Written. The interpretation of multiple known ValInsts for the same element and timepoint is that these are alternative names for the same values, for instance a PHINode and the incoming value when knowing it was the last executed block. That means that known values do not conflict if there is at least one (but not necessarily all) common ValInst. This principle also applies to Written values. Add a test for this principle. llvm-svn: 301481	2017-04-26 21:52:55 +00:00
Michael Kruse	8080011ca1	[unittests/DeLICM] Add test for Occupied vs Occupied. The interpretation of multiple known ValInsts for the same element and timepoint is that these are alternative names for the same values, for instance a PHINode and the incoming value when knowing it was the last executed block. That means that known values do not conflict if there is at least one (but not necessarily all) common ValInst. Add a case to test this principle. llvm-svn: 301480	2017-04-26 21:52:51 +00:00
Michael Kruse	3e519b949b	[DeLICM] Use Known information when comparing Occupied and Written. Do not conflict if a write writes the same value as already known. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32026 llvm-svn: 301460	2017-04-26 20:35:07 +00:00
Tobias Grosser	1c3eebac08	Update to isl-0.18-423-g30331fe This is just a general maintenance update. llvm-svn: 301433	2017-04-26 17:08:02 +00:00
Michael Kruse	cd2be66bf0	[DeLICM] Use Known information when comparing Existing.Occupied and Proposed.Occupied. Do not conflict if the value of Existing and Proposed are the same. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32025 llvm-svn: 301301	2017-04-25 10:57:32 +00:00
Siddharth Bhat	d277feda91	[PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now, specifically requests global address space to be annotated. This is necessary, if the IR should be run through NVPTX to generate OpenCL compatible PTX. The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl). Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends. Contributed-by: Philipp Schaad Reviewers: Meinersbur, grosser, bollu Reviewed By: grosser, bollu Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301299	2017-04-25 08:08:29 +00:00
Michael Kruse	a8b0be819a	[unittests] Derive Occupied from Unused when given. When both, OccupiedAndKnown and Unused are given, use the former only for the Known values. The relation Unused \union Occupied must always hold. This allows us to specify Known independently of Occupied. It is needed for an artificial test case in https://reviews.llvm.org/D32025. llvm-svn: 301284	2017-04-25 00:30:42 +00:00
Michael Kruse	b745b740f9	[unittests] Add postcondition to completeLifetime. llvm-svn: 301283	2017-04-25 00:30:32 +00:00
Siddharth Bhat	729377f063	[Polly] [DependenceInfo] change WAR generation, Read will not block Read Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute Read. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266	2017-04-24 22:23:12 +00:00
Tobias Grosser	9b34a08b19	[isl C++ bindings] Add explicit const casts for foreach bindings This avoids a compiler warning about lost 'const' attributes. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 301108	2017-04-23 07:54:12 +00:00
Michael Kruse	abf05b18db	[CMake] Fix polly-isl-test execution in out-of-LLVM-tree builds. The isl unittest modified its PATH variable to point to the LLVM bin dir. When building out-of-LLVM-tree, it does not contain the polly-isl-test executable, hence the test fails. Ensure that the polly-isl-test is written to a bin directory in the build root, just like it would happen in an inside-LLVM build. Then, change PATH to include that dir such that the executable in it is prioritized before any other location. llvm-svn: 301096	2017-04-22 23:02:53 +00:00
Michael Kruse	9c19d1f3aa	[CMake] Fix unittests in out-of-LLVM-tree builds. Unittests are linked against a subset of LLVM libraries and its transitive dependencies resolved by CMake. The information about indirect library dependency is not available when building separately from LLVM, which results in missing symbol errors while linking. Resolve this issue by querying llvm-config about the available LLVM libraries and link against all of them, since dependence information is still not available. llvm-svn: 301095	2017-04-22 23:02:46 +00:00
Michael Kruse	ab6b47d2e7	[CMake] Link unittests only against libLLVM.so, if available. We can only link against libLLVM.so or the individual libLLVM*.so components, but not both of them. Doing so results in these components existing twice in the program's address space, since they are already contained in libLLVM.so. The observable effect of this is that command line switches are registered multiple times (once for each instance), which is an error. This fixes llvm.org/PR32735. Reported-by: Singapuram Sanjay Srivallabh <singapuram.sanjay@gmail.com> llvm-svn: 301020	2017-04-21 19:03:51 +00:00
Tobias Grosser	9e6c00194f	GICHelper: remove forgotten isl foreach declarations These should have been dropped in r300323. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 300965	2017-04-21 10:50:33 +00:00
Michael Kruse	8431e996d3	[DeLICM] Use Known information when comparing Existing.Written and Proposed.Written. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32027 llvm-svn: 300874	2017-04-20 19:16:39 +00:00
Tobias Grosser	1f8b84094f	Update isl bindings to latest version (+ Polly extensions) After the isl C++ binding generator is now close to being upstreamed to isl, we synchronize the latest changes to Polly. These are mostly formatting changes plus a small interface change for the foreach callback function and some naming changes in isl::boolean. llvm-svn: 300398	2017-04-15 08:15:54 +00:00
Tobias Grosser	75aa1a9a49	Use isl C++ foreach implementation This commit switches Polly over to the isl::obj::foreach_* implementation, which is part of the new isl bindings and follows the foreach pattern established in Polly by Michael Kruse. The original isl C function: isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset, isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user); which required the user to define a static callback function to which all interesting parameters are passed via a 'void *' user-pointer, is on the C++ side available as a function that takes a std::function<>, which can carry any additional arguments without the need for a user pointer: stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const; The following code illustrates the use of the new C++ interface: auto Lambda = [=, &Result](isl::set Set) -> isl::stat { auto Shifted = shiftDimension(Set, Pos, Amount); Result = Result.add(Shifted); return isl::stat::ok; }; UnionSet.foreach_set(Lambda); Polly had some specialized foreach functions which did not require the lambdas to return a status flag. We remove these functions in this commit to move Polly completely over to the new isl interface. We may in the future discuss if functors without return values can be supported easily. Another extension proposed by Michael Kruse is the use of C++ iterators to allow the use of normal for loops to iterate over these sets. Such an extension would allow us to further simplify the code. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D30620 llvm-svn: 300323	2017-04-14 13:39:40 +00:00
Michael Kruse	a8e885d87c	[DeLICM] Introduce unittesting infrastructure for Known and Written. NFC. llvm-svn: 300212	2017-04-13 16:32:46 +00:00
Michael Kruse	72f3922534	[DeLICM] Export Known and Written to DeLICMTests. NFC. This will allow unittesting of new functionality based on Known and Written. llvm-svn: 300211	2017-04-13 16:32:39 +00:00
Michael Kruse	a2acc11949	[DeLICM] Add Knowledge::Known. NFC. This field will later contain a ValInst that is known to be stored in an occupied array element. llvm-svn: 300210	2017-04-13 16:32:31 +00:00
Michael Kruse	fa7c8cdfc6	[DeLICM] Make Knowledge::Written an isl::union_map. NFC. The map will later point to a ValInst that is written. llvm-svn: 300208	2017-04-13 16:32:25 +00:00
Michael Kruse	5e6456979b	[DeLICM] Rename Knowledge to KnowledgeStr. NFC. Some debuggers get confused by different classes of the same name defined independently in different translation units. llvm-svn: 300207	2017-04-13 16:32:16 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Tobias Grosser	67726b3260	Adjust to recent change in constructor definition of AllocaInst llvm-svn: 299913	2017-04-11 04:23:38 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Philip Pfaffe	78265cd237	Fix missing %loadPolly in ensure-correct-tile-sizes testcase llvm-svn: 299765	2017-04-07 12:55:26 +00:00
Roman Gareev	9d4d91ca6a	[FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern Use new values of the dimensions during their permutation. llvm-svn: 299663	2017-04-06 17:25:08 +00:00
Roman Gareev	e0d466342b	Restore the initial ordering of dimensions before applying the pattern matching Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce a miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662	2017-04-06 17:09:54 +00:00
Siddharth Bhat	5eeb1dd42e	[Polly] [ScheduleOptimizer] Prevent incorrect tile size computation Because Polly exposes parameters that directly influence tile size calculations, one can setup situations like divide-by-zero. Check against a possible divide-by-zero in getMacroKernelParams and return early. Also assert at the end of getMacroKernelParams that the block sizes computed for matrices are positive (>= 1). Tags: #polly Differential Revision: https://reviews.llvm.org/D31708 llvm-svn: 299633	2017-04-06 08:20:22 +00:00
Tobias Grosser	0d622a4bf9	Update to isl-0.18-417-gb9e7334 This is a regular maintenance update. llvm-svn: 299617	2017-04-06 03:41:47 +00:00
Michael Kruse	895f5d8080	Remove llvm.lifetime.start/end in original region. The current StackColoring algorithm does not correctly handle the situation when some, but not all paths from a BB to the entry node cross a llvm.lifetime.start. According to an interpretation of the language reference at http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic this might be correct, but it would cost too much effort to handle in StackColoring. To be on the safe side, remove all lifetime markers even in the original code version (they have never been copied to the optimized version) to ensure that no path to the entry block will cross a llvm.lifetime.start. The same principle applies to paths to a function return and the llvm.lifetime.end marker, so we remove them as well. This fixes llvm.org/PR32251. Also see the discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html llvm-svn: 299585	2017-04-05 20:09:59 +00:00
Tobias Grosser	59e42b8f96	Add two Polly images llvm-svn: 299534	2017-04-05 11:50:31 +00:00
Siddharth Bhat	bcbfdade41	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treats any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-source. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block earlier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. 
= Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 \| -> A = add (store0) \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = tmp + 42 \| A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 \| WAR 1 S1[0, 0]: \| tmp = A (load1) \| add = A + 42 \| A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429	2017-04-04 13:08:23 +00:00
Philip Pfaffe	447f175eb5	Fix formatting in LoopGenerators llvm-svn: 299424	2017-04-04 10:22:17 +00:00
Philip Pfaffe	2d950f36ee	[Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any functional changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423	2017-04-04 10:01:53 +00:00
Tobias Grosser	637be04b77	[PerfMonitor] Use Intrinsics::getDeclaration Instead of creating the declaration ourselves, we obtain it directly from the LLVM intrinsic definitions. This addresses a post-review comment for r299359. Suggested-by: Hongbin Zheng <etherzhhb@gmail.com> llvm-svn: 299360	2017-04-03 15:23:08 +00:00
Tobias Grosser	65371af2e1	[CodeGen] Add Performance Monitor Add support for -polly-codegen-perf-monitoring. When performance monitoring is enabled, we emit performance monitoring code during code generation that prints after program exit statistics about the total number of cycles executed as well as the number of cycles spent in scops. This gives an estimate on how useful polyhedral optimizations might be for a given program. Example output: Polly runtime information ------------------------- Total: 783110081637 Scops: 663718949365 In the future, we might also add functionality to measure how much time is spent in optimized scops and how many cycles are spent in the fallback code. Reviewers: bollu,sebpop Tags: #polly Differential Revision: https://reviews.llvm.org/D31599 llvm-svn: 299359	2017-04-03 14:55:37 +00:00
Michael Kruse	0b8949e6ed	[test] Fix two testcases. NFC. Trivial fix for two testcases. When Polly isn't linked into opt, independent of whether it's built in-tree or not, these testcases forget to load the appropriate library. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D31596 llvm-svn: 299357	2017-04-03 12:37:10 +00:00
Michael Kruse	6e7854a560	[ScopInfo] Fix typos in option description. llvm-svn: 299356	2017-04-03 12:03:38 +00:00
Tobias Grosser	bd96c73a1a	Add test case for r299352. llvm-svn: 299353	2017-04-03 07:44:23 +00:00
Tobias Grosser	696a1ee99d	[PollyIRBuilder] Bound size of alias metadata No-alias metadata grows quadratically in the size of arrays involved, which can become very costly for large programs. This commit bounds the number of arrays for which we construct no-alias information to ten. This is conservatively correct, as we just provide less information to LLVM, and it speeds up the compile time of one of my internal test cases from 'does-not-terminate' to 'finishes-in-less-than-a-minute'. In the future we might try to be more clever here, but this change should provide a good baseline. llvm-svn: 299352	2017-04-03 07:42:50 +00:00
Tobias Grosser	af940ae280	Update to isl-0.18-410-gc253447 This is a regular maintenance update to ensure latest isl changes are tested in our buildbots. llvm-svn: 299350	2017-04-03 06:46:16 +00:00
Huihui Zhang	d6d6a3f2ee	revert test commit r299024 llvm-svn: 299026	2017-03-29 20:23:56 +00:00
Huihui Zhang	9d19e9d232	test commit, add blank line llvm-svn: 299024	2017-03-29 20:10:45 +00:00
Michael Kruse	c3e9c1442d	[ScopInfo] Introduce ScopStmt::contains(BB*). NFC. Provide a common way for testing if a statement contains something for region and block statements. First user is RegionGenerator::addOperandToPHI. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298617	2017-03-23 16:12:21 +00:00
Tobias Grosser	1f7e7d3d93	Update to isl-0.18-402-ga30c537 This is a regular maintenance update. llvm-svn: 298595	2017-03-23 13:38:24 +00:00
Michael Kruse	9e4e7b467f	[DeLICM] Add const qualifiers. NFC. llvm-svn: 298546	2017-03-22 20:09:58 +00:00
Michael Kruse	174f483990	[Support] Add functions to ISLTools. Add shiftDim and convertZoneToTimepoints overloads for isl maps. Add distributeDomain, liftDomains and applyDomainRange functions. These are going to be used in https://reviews.llvm.org/D31247 (Add known array contents to Knowledge) llvm-svn: 298543	2017-03-22 19:31:06 +00:00
Michael Kruse	d07d155ebb	[DeLICM] Remove overloaded Knowledge constructor. NFC. The isl C++ bindings now have implicit conversions from isl::set to isl::union_set. Therefore the additional overload accepting isl::set is not required anymore. llvm-svn: 298529	2017-03-22 18:01:23 +00:00
Michael Kruse	29143ec3f7	[DeLICM] Remove AllElements. NFC. It is not used and will not be used (anymore) in future commits. llvm-svn: 298522	2017-03-22 17:18:39 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Roman Gareev	23df27682a	Map the new load to the base pointer of the invariant load hoisted load Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507	2017-03-22 13:57:53 +00:00

3241 Commits