llvm-project

Author	SHA1	Message	Date
Michael Kruse	4aac59cee1	[Simplify] Mark variables as used. NFC. Mark variables as used that are needed in assertions. llvm-svn: 302725	2017-05-10 20:42:02 +00:00
Siddharth Bhat	f5c81fb199	[Fix][Fortran Support] Don't use -debug-only in pattern matching test cases -debug-only is unnecessary and causes the tests to break in Release mode. Remove the option to opt in the test cases. llvm-svn: 302722	2017-05-10 20:10:17 +00:00
Michael Kruse	f41f274bf8	[DeLICM] Avoid compiler warning. NFC. gcc 5.4 warns about using a C-style cast to cast away a const. Use a const_cast instead. llvm-svn: 302715	2017-05-10 19:58:52 +00:00
Michael Kruse	f69a7c306b	[DeLICM] Always normalize domain. NFC. Some isl functions can simplify their __isl_keep arguments. The argument object after the call uses different constraints to represent the same set. Different constraints can result in different outputs when printed to a string. In assert builds, additional isl functions are called (inside assert()); these can change the internal representation of their read-only arguments such that printed strings are different in debug and non-debug builds. What happened here is that a call to isl_set_is_equal inside an assert in getScatterFor normalizes one of its arguments such that one redundant constraint is removed. The redundant constraint therefore does not appear in the string representing the domain, which FileCheck notices as a regression test failure compared to a build with assertions disabled. This fix removes the redundant constraints from the domain from the start such that the redundant constraint is removed in assert and non-assert builds. Isl adds a flag to such sets such that the removal of redundancies is not done multiple times (here: by isl_set_is_equal). Thanks to Tobias Grosser for reporting and hinting at the cause. llvm-svn: 302711	2017-05-10 19:50:45 +00:00
Siddharth Bhat	c47f039efd	[Fix] [Fortran Support] Fix variable name & make testcase activate on release There was: #ifdef NDEBUG This should be: #ifndef NDEBUG Also, the variable name was incorrect. Fixed the variable name. llvm-svn: 302696	2017-05-10 17:27:48 +00:00
Philip Pfaffe	d399607f65	[Polly][CMake] Fix syntactical errors in the exported config llvm-svn: 302657	2017-05-10 13:51:30 +00:00
Siddharth Bhat	f2dbba8183	[Fortran Support] Detect Fortran arrays & metadata from dragonegg output Add the ability to tag certain memory accesses as those belonging to Fortran arrays. We do this by pattern matching against known patterns of Dragonegg's LLVM IR output from Fortran code. Fortran arrays have metadata stored with them in a struct. This struct is called the "Fortran array descriptor", and a reference to this is stored in each MemoryAccess. Differential Revision: https://reviews.llvm.org/D32639 llvm-svn: 302653	2017-05-10 13:11:20 +00:00
Siddharth Bhat	8ac5340a4e	[GPUJIT] Disabled gcc's -Wpedantic for use of dlsym GCC's ISO C standard does not strictly define the behavior of converting a `void*` pointer to a function pointer, but dlsym's POSIX standard does. The retrieval of function pointers through dlsym in this case generates an unnecessary amount of warnings for every API function assignment, bloating the output. This patch removes GCC's `-Wpedantic` flag for retrieval and assignment of these functions. This simplifies debugging the output of GPUJIT. Differential Revision: https://reviews.llvm.org/D33008 llvm-svn: 302638	2017-05-10 11:51:44 +00:00
Tobias Grosser	f3adab4c20	[Polly] Canonicalize arrays according to base-ptr equivalence class Summary: In case two arrays share base pointers in the same invariant load equivalence class, we canonicalize all memory accesses to the first of these arrays (according to their order in the equivalence class). This enables us to optimize kernels such as boost::ublas by ensuring that different references to the C array are interpreted as accesses to the same array. Before this change the runtime alias check for ublas would fail, as it would assume models of the C array with differing (but identically valued) base pointers would reference distinct regions of memory whereas the referenced memory regions were indeed identical. As part of this change we remove most of the MemoryAccess::getBaseAddr interface. We already removed all references to getBaseAddr in previous commits to ensure that no code relies on matching base pointers between memory accesses and scop arrays -- except for three remaining uses where we need the original base pointer. We document for these situations that MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct from the base pointer of the scop array referenced by this memory access. Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert Reviewed By: Meinersbur Subscribers: etherzhhb Tags: #polly Differential Revision: https://reviews.llvm.org/D28518 llvm-svn: 302636	2017-05-10 10:59:58 +00:00
Tobias Grosser	0f7ce83018	Add noreturn attribute to avoid warnings about missing initialization Before this change we saw warnings such as: tools/GPURuntime/GPUJIT.c:1566:3: warning: variable 'DevPtr' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: llvm-svn: 302621	2017-05-10 05:20:56 +00:00
Tobias Grosser	1a2e0e6415	Fix formatting in Polly llvm-svn: 302620	2017-05-10 04:53:59 +00:00
Chandler Carruth	d742e5efa8	Update Polly for LLVM API change r302571 that removed varargs functions with a nullptr sentinel in favor of nicely typed variadic templates. llvm-svn: 302618	2017-05-10 02:39:35 +00:00
Siddharth Bhat	a90be207c6	[Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases). Reviewers: grosser, Meinersbur, bollu Reviewed By: bollu Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D32961 llvm-svn: 302515	2017-05-09 10:45:52 +00:00
Siddharth Bhat	0c8dcfd743	[Polly][GPUJIT] Fixed OpenCL 2.0 min requirement for Error codes Summary: Removed OpenCL error code identifiers introduced in version 2.0. Reviewers: grosser, bollu Reviewed By: bollu Subscribers: yaxunl, Anastasia, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D32962 llvm-svn: 302423	2017-05-08 14:10:37 +00:00
Siddharth Bhat	17f01968f1	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302379	2017-05-07 21:03:46 +00:00
Siddharth Bhat	5cf77125fc	[Polly] [GPUJIT] Adapted argument capitalization to fit standard Summary: Function argument naming changed to reflect capitalization standards. Reviewers: grosser, Meinersbur Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D32854 llvm-svn: 302376	2017-05-07 19:53:35 +00:00
Siddharth Bhat	448b8079cc	[Polly] [GPUJIT] Moved error prints to stderr Summary: Errors previously printed to stdout now get printed to stderr. Reviewers: grosser, Meinersbur Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D32852 llvm-svn: 302375	2017-05-07 18:31:25 +00:00
Tobias Grosser	c6ad42165f	Really disable test as intended in the previous commit llvm-svn: 302360	2017-05-06 19:18:19 +00:00
Tobias Grosser	0f4e94673d	Disable test to avoid buildbot noise This test was introduced in r302339. It works on my system, but breaks on the buildbots. llvm-svn: 302358	2017-05-06 18:50:28 +00:00
Michael Kruse	5ae08c0ebb	[DeLICM] Known knowledge. Extend the Knowledge class to store information about the contents of array elements and which values are written. Two knowledges do not conflict if the known content is the same. The content information is computed from writes to and loads from the array elements, and represented by "ValInst": isl spaces that compare equal if the value represented is the same. Differential Revision: https://reviews.llvm.org/D31247 llvm-svn: 302339	2017-05-06 14:03:58 +00:00
Michael Kruse	2a8f6f843f	[CMake] Introduce POLLY_BUNDLED_JSONCPP. Allow using a system-installed jsoncpp library instead of the bundled one with the setting POLLY_BUNDLED_JSONCPP=OFF. This fixes llvm.org/PR32929 Differential Revision: https://reviews.llvm.org/D32922 llvm-svn: 302336	2017-05-06 13:42:15 +00:00
Michael Kruse	391a2ac09b	[ScopBuilder] Move Scop::init to ScopBuilder. NFC. Scop::init is used only during SCoP construction. Therefore ScopBuilder seems the more appropriate place for it. We integrate it into its only caller ScopBuilder::buildScop where some other construction steps already took place. Differential Revision: https://reviews.llvm.org/D32908 llvm-svn: 302276	2017-05-05 20:09:08 +00:00
Tobias Grosser	c1ddedc657	Fix typo llvm-svn: 302244	2017-05-05 15:46:01 +00:00
Michael Kruse	f1052ceb5e	[ScopBuilder] Do not verify unfeasible SCoPs. SCoPs with unfeasible runtime context are thrown away and therefore do not need their uses verified. The added test case requires a complexity limit to be exceeded. Normally, error statements are removed from the SCoP and for that reason are skipped during the verification. If there is an unfeasible runtime context (here: because of the complexity limit being reached), the removal of error statements and other SCoP construction steps are skipped to not waste time. Error statements are not modeled in SCoPs and therefore have no requirements on whether the scalars used in them are available. llvm-svn: 302234	2017-05-05 13:38:35 +00:00
Tobias Grosser	d5727c5011	Fix handling of signWrappedSets in access relations Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip manually bounding the access relation in case the parameter of the load instruction is already a wrapped set. Later on we assume that the lower bound on the set is always smaller or equal to the upper bound on the set. Bug 32715 manages to construct a sign wrapped set, in which case the assertion does not necessarily hold. Fix this by handling a sign wrapped set similar to a normal wrapped set, that is skipping the computation. Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch> Reviewers: grosser Subscribers: pollydev, llvm-commits Tags: #Polly Differential Revision: https://reviews.llvm.org/D32893 llvm-svn: 302231	2017-05-05 13:20:47 +00:00
Siddharth Bhat	c1267b9baa	Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen" This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933. Patches should have been submitted in the order of: 1. D32852 2. D32854 3. D32431 I mistakenly pushed D32431(3) first. Reverting to push in the correct order. llvm-svn: 302217	2017-05-05 09:02:08 +00:00
Siddharth Bhat	51904ae35a	[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen Summary: When compiling for GPU, one can now choose to compile for OpenCL or CUDA, with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library for that purpose, correctly choosing the corresponding library calls to the option chosen when compiling (via different initialization calls). Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far). Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay Reviewed By: grosser, Meinersbur Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32431 llvm-svn: 302215	2017-05-05 07:54:49 +00:00
Michael Kruse	704c03e03b	[ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH. It was forgotten in r302157. llvm-svn: 302163	2017-05-04 15:55:54 +00:00
Michael Kruse	eedae7630a	Introduce VirtualUse. NFC. If a ScopStmt references a (scalar) value, there are multiple possibilities where this value can come from. The decision about what kind of use it is must be handled consistently at different places, which can be error-prone. VirtualUse is meant to centralize the handling of the different types of value uses. This patch makes ScopBuilder and CodeGeneration use VirtualUse. This already helps to show inconsistencies with the value handling. In order to keep this patch NFC, exceptions to the general rules are added. These might be fixed later if they turn out to be problems. Overall, this should result in fewer post-codegen IR-verification errors, but instead assertion failures in `getNewValue` that are closer to the actual error. Differential Revision: https://reviews.llvm.org/D32667 llvm-svn: 302157	2017-05-04 15:22:57 +00:00
Michael Kruse	45d5cf47bf	[CMake] Remove POLLY_TEST_DIRECTORIES. The test subdirectory POLLY_TEST_DIRECTORIES was heavily outdated and only used in out-of-LLVM-tree builds (to generate polly-test-${subdir} targets). llvm-svn: 302142	2017-05-04 12:21:25 +00:00
Tobias Grosser	3f25a7e8ee	[ScopDetection] Check for already known required-invariant loads [NFC] For certain test cases we spent over 50% of the scop detection time in checking if a load is likely invariant. We can avoid most of these checks by testing early on if a load is expected to be invariant. Doing this reduces scop-detection time on a large benchmark from 52 seconds to just 25 seconds. No functional change is expected. llvm-svn: 302134	2017-05-04 10:16:20 +00:00
Tobias Grosser	1859463876	Adjust test case to not trigger the SCEV optimization committed in r302096 This makes sure we still test the case that a PHI-NODE cannot be analyzed by scalar evolution and consequently must be code generated explicitly. As Michael's optimization triggers only on a very specific "add %iv, %step" pattern, just changing 'add' to 'mul' adds back test coverage. llvm-svn: 302132	2017-05-04 08:56:54 +00:00
Tobias Grosser	e2ccc3fb33	[ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters LLVM-IR names are commonly available in debug builds, but often not in release builds. Hence, using LLVM-IR names to identify statements or memory reference results makes the behavior of Polly depend on the compile mode. This is undesirable. Hence, we now just number the statements instead of using LLVM-IR names to identify them (this issue has previously been brought up by Zino Benaissa). However, as LLVM-IR names help in making test cases more readable, we add an option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by default set in the polly tests to make test cases more readable. This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the following test case provided by Eli Friedman <efriedma@codeaurora.org> (already used in one of the previous commits): struct X { int x; }; void a(); #define SIG (int x, X **y, X **z) typedef void (*fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } For a larger benchmark I have on-hand (10000 loops), this reduces the time for running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%. The reason for this large speedup is that our previous use of printAsOperand had a quadratic cost, as for each printed and unnamed operand the full function was scanned to find the instruction number that identifies the operand. We do not need to adjust the way memory reference ids are constructed, as they do not use LLVM values. Reviewed by: efriedma Tags: #polly Differential Revision: https://reviews.llvm.org/D32789 llvm-svn: 302072	2017-05-03 20:08:52 +00:00
Siddharth Bhat	88619946b6	[CUDA Managed Memory] Fix regression introduced by Managed Memory - Fixes breakage from commit 5536f. - Interference with commit 764f3 caused testcase to fail. Reverting 764f3 allows commit 5536f to succeed. - Generated kernel code was slightly different due to 764f3, which caused testcase to fail. llvm-svn: 302021	2017-05-03 13:15:27 +00:00
Tobias Grosser	72684bbaf5	[ScopInfo] Remove code not needed anymore after r302004 llvm-svn: 302005	2017-05-03 08:02:32 +00:00
Tobias Grosser	8133128c17	[ScopInfo] Do not add array name into memory reference ids Before this change a memory reference identifier had the form: <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11 After this change, we use the format: <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0 The name of the array that is accessed through a memory reference is not necessary to uniquely identify a memory reference, but was only added to provide additional information for debugging. We drop this information now for the following two reasons: 1) This shortens the names and consequently improves readability 2) This removes a second location where we decide on the name of a scop array, leaving us only with the location where the actual scop array is created. Having after 2) only a single location to name scop arrays will allow us to change the naming convention of scop arrays more easily, which we will do in a future commit to reduce compilation time. llvm-svn: 302004	2017-05-03 07:57:35 +00:00
Siddharth Bhat	6c3d19ba45	[NFC] [IslAST] fix typo: "int the" -> "in the" llvm-svn: 301925	2017-05-02 14:54:49 +00:00
Michael Kruse	ecbd57e98a	[CMake] Move PollyCore to Polly project folder. This keeps the artifacts consistently structured in the "Polly" folder of Visual Studio solutions. llvm-svn: 301779	2017-04-30 21:07:05 +00:00
Hongbin Zheng	e9a9932712	[Polly] Make PollyCore depend on intrinsics_gen llvm-svn: 301734	2017-04-29 03:12:17 +00:00
Tobias Grosser	3d76f2ccd3	[tests] Ensure all test cases use named variables This makes it easier to read and possibly even modify the test cases, as there is no need to keep the variable increment in steps of one. More importantly, by using explicit variable names we do not need to rely on the implicit numbering of statements when dumping the scop information. In a future commit, this implicit numbering will likely not be used any more to refer to LLVM-IR values as it is very expensive to construct. llvm-svn: 301689	2017-04-28 21:16:29 +00:00
Tobias Grosser	f13722177b	[Codegen] Disable Polly's codegen verification by default As has been reported in the previous commit, codegen verification can result in quadratic compile time increases for large functions with many scops. This is certainly not something we would like to have in the Polly default configuration. Hence, we disable codegen verification by default -- also to see if this resolves some of the compilation timeouts we currently see on the AOSP buildbots. We still leave this feature in Polly as it has shown _very_ useful for debugging. In fact, we may want to have a discussion if we can bring this feature back in a way that does not impact compilation time so much. Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and for providing the test case in the previous commit (where I forgot to acknowledge him). llvm-svn: 301670	2017-04-28 19:15:28 +00:00
Tobias Grosser	d439911f73	[CodeGen] Skip verify if -polly-codegen-verify is set to false Before this change, we always tried to verify the function and printed verification errors, but just did not abort in case -polly-codegen-verify=false was set and verification failed. As verification can become very costly -- for large functions with many scops we may verify the very same function very often -- this can affect compile time very negatively. Hence, we respect the -polly-codegen-verify flag with this check, ensuring that no verification is run if -polly-codegen-verify=false. This reduces code generation time from 26 seconds to 4 seconds on the test case below with -polly-codegen-verify=false: struct X { int x; }; void a(); #define SIG (int x, X **y, X **z) typedef void (*fn)SIG; #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); } #define FN5 FN FN FN FN FN #define FN25 FN5 FN5 FN5 FN5 #define FN125 FN25 FN25 FN25 FN25 FN25 #define FN250 FN125 FN125 #define FN1250 FN250 FN250 FN250 FN250 FN250 void x SIG { FN1250 } llvm-svn: 301669	2017-04-28 19:08:20 +00:00
Siddharth Bhat	abed49699b	[Polly] [PPCGCodeGeneration] Add managed memory support to GPU code generation. This needs changes to GPURuntime to expose synchronization between host and device. 1. Needs better function naming, I want a better name than "getOrCreateManagedDeviceArray" 2. DeviceAllocations is used by both the managed memory and the non-managed memory path. This exploits the fact that the two code paths are never run together. I'm not sure if this is the best design decision Reviewed by: PhilippSchaad Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301640	2017-04-28 11:16:30 +00:00
Tobias Grosser	287942ae82	Update to isl-0.18-592-gb50ad59 This is just a general maintenance update. llvm-svn: 301624	2017-04-28 06:11:17 +00:00
Tobias Grosser	c96c1d8c87	[ScopInfo] Consider only write-free dereferenceable loads as invariant When we introduced in r297375 support for hoisting loads that are known to be dereferenceable without any conditional guard, we forgot to keep the check to verify that no other write into the very same location exists. This change ensures now that dereferenceable loads are allowed to access everything, but can only be hoisted in case no conflicting write exists. This resolves llvm.org/PR32778 Reported-by: Huihui Zhang <huihuiz@codeaurora.org> llvm-svn: 301582	2017-04-27 20:08:16 +00:00
Michael Kruse	792a6fcc57	[CMake] Use object library to build the two flavours of Polly. Polly comes in two library flavors: One loadable module to use the LLVM framework -load mechanism, and another one that host applications can link to. These have very different requirements for Polly's own dependencies. The loadable module assumes that all its LLVM dependencies are already available in the address space of the host application, and is not allowed to bring in its own copy of any LLVM library (including the NVPTX backend in case of Polly-ACC). The non-module library is intended to be linked to using target_link_libraries. CMake would then resolve all of its dependencies, including NVPTX and ensure that only a single instance of each library will be used. Differential Revision: https://reviews.llvm.org/D32442 llvm-svn: 301558	2017-04-27 16:13:03 +00:00
Philip Pfaffe	5d790fc03c	[Polly][Cmake] Add missing include paths to exported cmake config llvm-svn: 301552	2017-04-27 16:03:42 +00:00
Hongbin Zheng	0f8f177682	[Polly] Do not introduce address space cast Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally. Differential Revision: https://reviews.llvm.org/D32581 llvm-svn: 301519	2017-04-27 06:42:14 +00:00
Michael Kruse	e6d2bebb25	[unittests/DeLICM] Add test for Written vs Written. The interpretation of multiple known ValInsts for the same element and timepoint is that these are alternative names for the same values, for instance a PHINode and the incoming value when knowing it was the last executed block. That means that known values do not conflict if there is at least one (but not necessarily all) common ValInst. This principle also applies to Written values. Add a test for this principle. llvm-svn: 301481	2017-04-26 21:52:55 +00:00
Michael Kruse	8080011ca1	[unittests/DeLICM] Add test for Occupied vs Occupied. The interpretation of multiple known ValInsts for the same element and timepoint is that these are alternative names for the same values, for instance a PHINode and the incoming value when knowing it was the last executed block. That means that known values do not conflict if there is at least one (but not necessarily all) common ValInst. Add a case to test this principle. llvm-svn: 301480	2017-04-26 21:52:51 +00:00
Michael Kruse	3e519b949b	[DeLICM] Use Known information when comparing Occupied and Written. Do not conflict if a write writes the same value as already known. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32026 llvm-svn: 301460	2017-04-26 20:35:07 +00:00
Tobias Grosser	1c3eebac08	Update to isl-0.18-423-g30331fe This is just a general maintenance update. llvm-svn: 301433	2017-04-26 17:08:02 +00:00
Michael Kruse	cd2be66bf0	[DeLICM] Use Known information when comparing Existing.Occupied and Proposed.Occupied. Do not conflict if the value of Existing and Proposed are the same. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32025 llvm-svn: 301301	2017-04-25 10:57:32 +00:00
Siddharth Bhat	d277feda91	[PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now, specifically requests global address space to be annotated. This is necessary, if the IR should be run through NVPTX to generate OpenCL compatible PTX. The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl). Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends. Contributed-by: Philipp Schaad Reviewers: Meinersbur, grosser, bollu Reviewed By: grosser, bollu Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia Tags: #polly Differential Revision: https://reviews.llvm.org/D32215 llvm-svn: 301299	2017-04-25 08:08:29 +00:00
Michael Kruse	a8b0be819a	[unittests] Derive Occupied from Unused when given. When both, OccupiedAndKnown and Unused are given, use the former only for the Known values. The relation Unused \union Occupied must always hold. This allows us to specify Known independently of Occupied. It is needed for an artificial test case in https://reviews.llvm.org/D32025. llvm-svn: 301284	2017-04-25 00:30:42 +00:00
Michael Kruse	b745b740f9	[unittests] Add postcondition to completeLifetime. llvm-svn: 301283	2017-04-25 00:30:32 +00:00
Siddharth Bhat	729377f063	[Polly] [DependenceInfo] change WAR generation, Read will not block Read Earlier, the call to buildFlow was: WAR = buildFlow(Write, Read, MustWrite, Schedule). This meant that Read could block another Read, since must-sources can block each other. Fixed the call to buildFlow to correctly compute Read. The resulting code needs to do some ISL juggling to get the output we want. Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623 Reviewers: Meinersbur Tags: #polly Differential Revision: https://reviews.llvm.org/D32011 llvm-svn: 301266	2017-04-24 22:23:12 +00:00
Tobias Grosser	9b34a08b19	[isl C++ bindings] Add explicit const casts for foreach bindings This avoids a compiler warning about lost 'const' attributes. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 301108	2017-04-23 07:54:12 +00:00
Michael Kruse	abf05b18db	[CMake] Fix polly-isl-test execution in out-of-LLVM-tree builds. The isl unittest modified its PATH variable to point to the LLVM bin dir. When building out-of-LLVM-tree, it does not contain the polly-isl-test executable, hence the test fails. Ensure that the polly-isl-test is written to a bin directory in the build root, just like it would happen in an inside-LLVM build. Then, change PATH to include that dir such that the executable in it is prioritized before any other location. llvm-svn: 301096	2017-04-22 23:02:53 +00:00
Michael Kruse	9c19d1f3aa	[CMake] Fix unittests in out-of-LLVM-tree builds. Unittests are linked against a subset of LLVM libraries and its transitive dependencies resolved by CMake. The information about indirect library dependency is not available when building separately from LLVM, which results in missing symbol errors while linking. Resolve this issue by querying llvm-config about the available LLVM libraries and link against all of them, since dependence information is still not available. llvm-svn: 301095	2017-04-22 23:02:46 +00:00
Michael Kruse	ab6b47d2e7	[CMake] Link unittests only against libLLVM.so, if available. We can only link against libLLVM.so or the individual libLLVM*.so components, but not both of them. Doing so results in these components existing twice in the program's address space, since they are already contained in libLLVM.so. The observable effect of this is that command line switches are registered multiple times (once for each instance), which is an error. This fixes llvm.org/PR32735. Reported-by: Singapuram Sanjay Srivallabh <singapuram.sanjay@gmail.com> llvm-svn: 301020	2017-04-21 19:03:51 +00:00
Tobias Grosser	9e6c00194f	GICHelper: remove forgotten isl foreach declarations These should have been dropped in r300323. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 300965	2017-04-21 10:50:33 +00:00
Michael Kruse	8431e996d3	[DeLICM] Use Known information when comparing Existing.Written and Proposed.Written. This change only affects unit tests, but no functional changes are expected on LLVM-IR, as no Known information is yet extracted and consequently this functionality is only triggered through unit tests. Differential Revision: https://reviews.llvm.org/D32027 llvm-svn: 300874	2017-04-20 19:16:39 +00:00
Tobias Grosser	1f8b84094f	Update isl bindings to latest version (+ Polly extensions) After the isl C++ binding generator is now close to being upstreamed to isl, we synchronize the latest changes to Polly. These are mostly formatting changes plus a small interface change for the foreach callback function and some naming changes in isl::boolean. llvm-svn: 300398	2017-04-15 08:15:54 +00:00
Tobias Grosser	75aa1a9a49	Use isl C++ foreach implementation This commit switches Polly over to the isl::obj::foreach_* implementation, which is part of the new isl bindings and follows the foreach pattern established in Polly by Michael Kruse. The original isl C function: isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset, isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user); which required the user to define a static callback function to which all interesting parameters are passed via a 'void *' user-pointer, is on the C++ side available as a function that takes a std::function<>, which can carry any additional arguments without the need for a user pointer: stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const; The following code illustrates the use of the new C++ interface: auto Lambda = [=, &Result](isl::set Set) -> isl::stat { auto Shifted = shiftDimension(Set, Pos, Amount); Result = Result.add(Shifted); return isl::stat::ok; }; UnionSet.foreach_set(Lambda); Polly had some specialized foreach functions which did not require the lambdas to return a status flag. We remove these functions in this commit to move Polly completely over to the new isl interface. We may in the future discuss if functors without return values can be supported easily. Another extension proposed by Michael Kruse is the use of C++ iterators to allow the use of normal for loops to iterate over these sets. Such an extension would allow us to further simplify the code. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D30620 llvm-svn: 300323	2017-04-14 13:39:40 +00:00
Michael Kruse	a8e885d87c	[DeLICM] Introduce unittesting infrastructure for Known and Written. NFC. llvm-svn: 300212	2017-04-13 16:32:46 +00:00
Michael Kruse	72f3922534	[DeLICM] Export Known and Written to DeLICMTests. NFC. This will allow unittesting of new functionality based on Known and Written. llvm-svn: 300211	2017-04-13 16:32:39 +00:00
Michael Kruse	a2acc11949	[DeLICM] Add Knowledge::Known. NFC. This field will later contain a ValInst that is known to be stored in an occupied array element. llvm-svn: 300210	2017-04-13 16:32:31 +00:00
Michael Kruse	fa7c8cdfc6	[DeLICM] Make Knowledge::Written an isl::union_map. NFC. The map will later point to a ValInst that is written. llvm-svn: 300208	2017-04-13 16:32:25 +00:00
Michael Kruse	5e6456979b	[DeLICM] Rename Knowledge to KnowledgeStr. NFC. Some debuggers get confused by different classes of the same name defined independently in different translation units. llvm-svn: 300207	2017-04-13 16:32:16 +00:00
Tobias Grosser	7b5a4dfd46	Exploit BasicBlock::getModule to shorten code Suggested-by: Roman Gareev <gareevroman@gmail.com> llvm-svn: 299914	2017-04-11 04:59:13 +00:00
Tobias Grosser	67726b3260	Adjust to recent change in constructor definition of AllocaInst llvm-svn: 299913	2017-04-11 04:23:38 +00:00
Matt Arsenault	b3e30c32ce	Update for alloca construction changes llvm-svn: 299905	2017-04-11 00:12:58 +00:00
Philip Pfaffe	78265cd237	Fix missing %loadPolly in ensure-correct-tile-sizes testcase llvm-svn: 299765	2017-04-07 12:55:26 +00:00
Roman Gareev	9d4d91ca6a	[FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern Use new values of the dimensions during their permutation. llvm-svn: 299663	2017-04-06 17:25:08 +00:00
Roman Gareev	e0d466342b	Restore the initial ordering of dimensions before applying the pattern matching Dimensions of band nodes can be implicitly permuted by the algorithm applied during the schedule generation. For example, in case of the following matrix-matrix multiplication, for (i = 0; i < 1024; i++) for (k = 0; k < 1024; k++) for (j = 0; j < 1024; j++) C[i][j] += A[i][k] * B[k][j]; it can produce the following schedule tree domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and 0 <= i2 <= 1023 }" child: schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] }, { Stmt_for_body6[i0, i1, i2] -> [(i1)] }, { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]" permutable: 1 coincident: [ 1, 1, 0 ] The current implementation of the pattern matching optimizations relies on the initial ordering of dimensions. Otherwise, it can produce a miscompilation (e.g., [1]). This patch helps to restore the initial ordering of dimensions by recreating the band node when the corresponding conditions are satisfied. Refs.: [1] - https://bugs.llvm.org/show_bug.cgi?id=32500 Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D31741 llvm-svn: 299662	2017-04-06 17:09:54 +00:00
Siddharth Bhat	5eeb1dd42e	[Polly] [ScheduleOptimizer] Prevent incorrect tile size computation Because Polly exposes parameters that directly influence tile size calculations, one can set up situations like divide-by-zero. Check against a possible divide-by-zero in getMacroKernelParams and return early. Also assert at the end of getMacroKernelParams that the block sizes computed for matrices are positive (>= 1). Tags: #polly Differential Revision: https://reviews.llvm.org/D31708 llvm-svn: 299633	2017-04-06 08:20:22 +00:00
Tobias Grosser	0d622a4bf9	Update to isl-0.18-417-gb9e7334 This is a regular maintenance update. llvm-svn: 299617	2017-04-06 03:41:47 +00:00
Michael Kruse	895f5d8080	Remove llvm.lifetime.start/end in original region. The current StackColoring algorithm does not correctly handle the situation when some, but not all paths from a BB to the entry node cross a llvm.lifetime.start. According to an interpretation of the language reference at http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic this might be correct, but it would cost too much effort to handle in StackColoring. To be on the safe side, remove all lifetime markers even in the original code version (they have never been copied to the optimized version) to ensure that no path to the entry block will cross a llvm.lifetime.start. The same principle applies to paths to a function return and the llvm.lifetime.end marker, so we remove them as well. This fixes llvm.org/PR32251. Also see the discussion at http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html llvm-svn: 299585	2017-04-05 20:09:59 +00:00
Tobias Grosser	59e42b8f96	Add two Polly images llvm-svn: 299534	2017-04-05 11:50:31 +00:00
Siddharth Bhat	bcbfdade41	[Polly] [DependenceInfo] change WAR, WAW generation to correct semantics = Change of WAR, WAW generation: = - `buildFlow(Sink, MustSource, MaySource, Sink)` treats any flow of the form `sink <- may source <- must source` as a may dependence. - we used to call: ```lang=cpp, name=old-flow-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This caused some WAW dependences to be treated as WAR dependences. - Incorrect semantics. - Now, we call WAR and WAW correctly. == Correct WAW: == ```lang=cpp, name=new-waw-call.cpp Flow = buildFlow(Write, MustWrite, MayWrite, Schedule); WAW = isl_union_flow_get_may_dependence(Flow); isl_union_flow_free(Flow); ``` == Correct WAR: == ```lang=cpp, name=new-war-call.cpp Flow = buildFlow(Write, Read, MustWrite, Schedule); WAR = isl_union_flow_get_must_dependence(Flow); isl_union_flow_free(Flow); ``` - We want the "shortest" WAR possible (exact dependences). - We mark all the must-writes as may-source, reads as must-source. - Then, we ask for must dependence. - This removes all the reads that flow through a must-write before reaching a sink. - Note that we only block earlier writes with must-writes. This is intuitively correct, as we do not want may-writes to block must-writes. - Leaves us with direct (R -> W). - This affects reduction generation since RED is built using WAW and WAR. = New StrictWAW for Reductions: = - We used to call: ```lang=cpp,name=old-waw-war-call.cpp Flow = buildFlow(MustWrite, MustWrite, Read, Schedule); WAW = isl_union_flow_get_must_dependence(Flow); WAR = isl_union_flow_get_may_dependence(Flow); ``` - This is the right model of WAW we need for reductions, just not in general. - Reductions need to track only strict WAW, without any interfering reductions. 
= Explanation: Why the new WAR dependences in tests are correct: = - We no longer set WAR = WAR - WAW - Hence, we will have WAR dependences that were originally removed. - These may look incorrect, but in fact make sense. == Code: == ```lang=llvm, name=new-war-dependence.ll ; void manyreductions(long A) { ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S0: A += 42; ; ; for (long i = 0; i < 1024; i++) ; for (long j = 0; j < 1024; j++) ; S1: A += 42; ; ``` === WAR dependence: === { S0[1023, 1023] -> S1[0, 0] } - Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences: ```lang=cpp, name=dependence-incorrect, counterexample S0[1023, 1023]: -- tmp = A (load0)-- WAR 2 add = tmp + 42 | -> A = add (store0) | WAR 1 S1[0, 0]: | tmp = A (load1) | add = tmp + 42 | A = add (store1)<- ``` - One may assume that WAR2 hides WAR1 (since store0 happens before store1). However, within a statement, Polly has no idea about the ordering of loads and stores. - Hence, according to Polly, the code may have looked like this: ```lang=cpp, name=dependence-correct S0[1023, 1023]: A = add (store0) tmp = A (load0) ---* add = A + 42 | WAR 1 S1[0, 0]: | tmp = A (load1) | add = A + 42 | A = add (store1) <-* ``` - So, Polly generates (correct) WAR dependences. It does not make sense to remove these dependences, since they are correct with respect to Polly's model. Reviewers: grosser, Meinersbur tags: #polly Differential revision: https://reviews.llvm.org/D31386 llvm-svn: 299429	2017-04-04 13:08:23 +00:00
Philip Pfaffe	447f175eb5	Fix formatting in LoopGenerators llvm-svn: 299424	2017-04-04 10:22:17 +00:00
Philip Pfaffe	2d950f36ee	[Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers Summary: A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly. This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however. Reviewers: grosser, Meinersbur Reviewed By: grosser Subscribers: nemanjai, pollydev, llvm-commits Tags: #polly Differential Revision: https://reviews.llvm.org/D31653 llvm-svn: 299423	2017-04-04 10:01:53 +00:00
Tobias Grosser	637be04b77	[PerfMonitor] Use Intrinsics::getDeclaration Instead of creating the declaration ourselves, we obtain it directly from the LLVM intrinsic definitions. This addresses a post-review comment for r299359. Suggested-by: Hongzing Zheng <etherzhhb@gmail.com> llvm-svn: 299360	2017-04-03 15:23:08 +00:00
Tobias Grosser	65371af2e1	[CodeGen] Add Performance Monitor Add support for -polly-codegen-perf-monitoring. When performance monitoring is enabled, we emit performance monitoring code during code generation that prints after program exit statistics about the total number of cycles executed as well as the number of cycles spent in scops. This gives an estimate on how useful polyhedral optimizations might be for a given program. Example output: Polly runtime information ------------------------- Total: 783110081637 Scops: 663718949365 In the future, we might also add functionality to measure how much time is spent in optimized scops and how many cycles are spent in the fallback code. Reviewers: bollu,sebpop Tags: #polly Differential Revision: https://reviews.llvm.org/D31599 llvm-svn: 299359	2017-04-03 14:55:37 +00:00
Michael Kruse	0b8949e6ed	[test] Fix two testcases. NFC. Trivial fix for two testcases. When Polly isn't linked into opt, independent of whether it's built in-tree or not, these testcases forget to load the appropriate library. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D31596 llvm-svn: 299357	2017-04-03 12:37:10 +00:00
Michael Kruse	6e7854a560	[ScopInfo] Fix typos in option description. llvm-svn: 299356	2017-04-03 12:03:38 +00:00
Tobias Grosser	bd96c73a1a	Add test case for r299352. llvm-svn: 299353	2017-04-03 07:44:23 +00:00
Tobias Grosser	696a1ee99d	[PollyIRBuilder] Bound size of alias metadata No-alias metadata grows quadratically in the number of arrays involved, which can become very costly for large programs. This commit bounds the number of arrays for which we construct no-alias information to ten. This is conservatively correct, as we just provide less information to LLVM, and it speeds up the compile time of one of my internal test cases from 'does-not-terminate' to 'finishes-in-less-than-a-minute'. In the future we might try to be more clever here, but this change should provide a good baseline. llvm-svn: 299352	2017-04-03 07:42:50 +00:00
Tobias Grosser	af940ae280	Update to isl-0.18-410-gc253447 This is a regular maintenance update to ensure latest isl changes are tested in our buildbots. llvm-svn: 299350	2017-04-03 06:46:16 +00:00
Huihui Zhang	d6d6a3f2ee	revert test commit r299024 llvm-svn: 299026	2017-03-29 20:23:56 +00:00
Huihui Zhang	9d19e9d232	test commit, add blank line llvm-svn: 299024	2017-03-29 20:10:45 +00:00
Michael Kruse	c3e9c1442d	[ScopInfo] Introduce ScopStmt::contains(BB*). NFC. Provide a common way for testing if a statement contains something for region and block statements. First user is RegionGenerator::addOperandToPHI. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298617	2017-03-23 16:12:21 +00:00
Tobias Grosser	1f7e7d3d93	Update to isl-0.18-402-ga30c537 This is a regular maintenance update. llvm-svn: 298595	2017-03-23 13:38:24 +00:00
Michael Kruse	9e4e7b467f	[DeLICM] Add const qualifiers. NFC. llvm-svn: 298546	2017-03-22 20:09:58 +00:00
Michael Kruse	174f483990	[Support] Add functions to ISLTools. Add shiftDim and convertZoneToTimepoints overloads for isl maps. Add distributeDomain, liftDomains and applyDomainRange functions. These are going to be used in https://reviews.llvm.org/D31247 (Add known array contents to Knowledge) llvm-svn: 298543	2017-03-22 19:31:06 +00:00
Michael Kruse	d07d155ebb	[DeLICM] Remove overloaded Knowledge constructor. NFC. The isl C++ bindings now has implicit conversions from isl::set to isl::union_set. Therefore the additional overload accepting isl::set is not required anymore. llvm-svn: 298529	2017-03-22 18:01:23 +00:00
Michael Kruse	29143ec3f7	[DeLICM] Remove AllElements. NFC. It is not used and will not be used (anymore) in future commits. llvm-svn: 298522	2017-03-22 17:18:39 +00:00
Roman Gareev	cdfb57dc46	Introduce another level of metadata to distinguish non-aliasing accesses Introduce another level of alias metadata to distinguish the individual non-aliasing accesses that have inter iteration alias-free base pointers marked with "Inter iteration alias-free" mark nodes. It can be used to, for example, distinguish different stores (loads) produced by unrolling of the innermost loops and, subsequently, sink (hoist) them by LICM. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30606 llvm-svn: 298510	2017-03-22 14:25:24 +00:00
Roman Gareev	23df27682a	Map the new load to the base pointer of the invariant load hoisted load Map the new load to the base pointer of the invariant load hoisted load to be able to find the alias information for it. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30605 llvm-svn: 298507	2017-03-22 13:57:53 +00:00
Siddharth Bhat	44b6cb4e63	[DependenceInfo] change name Write to MustWrite to remove ambiguity [NFC] "Write" is an overloaded term. In collectInfo() till buildFlow(), it is used to mean "must writes". However, within the memory based analysis, it is used to mean "both may and must writes". Renaming the Write variable helps clarify this difference. Reviewers: grosser Tags: #polly Differential Revision: https://reviews.llvm.org/D31181 llvm-svn: 298361	2017-03-21 11:54:08 +00:00
Tobias Grosser	29eaa16b7e	Update isl to isl-0.18-395-g77701b3 This is a normal maintenance update. llvm-svn: 298352	2017-03-21 09:12:11 +00:00
Michael Kruse	0d10696693	[DeLICM] Refactor out parseSetOrNull. NFC. Note that the isl::union_set(isl_ctx,std::string) constructor will auto-convert the char* to an std::string. Converting a nullptr to std::string is undefined in C++11 (sect. 21.4.2.9). llvm-svn: 298259	2017-03-20 15:37:32 +00:00
Michael Kruse	d75d56e9bf	[DeLICM] Add forgotten isl_space_set_tuple_id in unittests. Otherwise the isl_id NewId which ensures uniqueness of the created space is unused. None of the tests currently uses a nameless tuple, so there is no change in what is tested. llvm-svn: 298258	2017-03-20 15:24:45 +00:00
Tobias Grosser	b28f86e9e6	[CodeGen] Remove need for all parameters to be in scop context for load hoisting. When not adding constraints on parameters using -polly-ignore-parameter-bounds, the context may not necessarily list all parameter dimensions. To support code generation in this situation, we now always iterate over the actual parameter list, rather than relying on the context to list all parameter dimensions. llvm-svn: 298197	2017-03-18 23:12:49 +00:00
Tobias Grosser	1be726a40d	[IslExprBuilder] Print accessed memory locations with RuntimeDebugBuilder After this change, enabling -polly-codegen-add-debug-printing in combination with -polly-codegen-generate-expressions allows us to instrument the compiled binaries to not only print the values stored and loaded to a given memory access, but also to print the accessed location with array name and per-dimension offset: MemRef_A[3][2] Store to 6299784: 5.000000 MemRef_A[3][3] Load from 6299788: 0.000000 MemRef_A[3][3] Store to 6299788: 6.000000 This can be very helpful for debugging. llvm-svn: 298194	2017-03-18 20:54:43 +00:00
Tobias Grosser	7693b116a1	[OpenMP] Do not emit lifetime markers for context In commit r219005 lifetime markers have been introduced to mark the lifetime of the OpenMP context data structure. However, their use seems incorrect and recently caused a miscompile in ASC_Sequoia/CrystalMk after r298053 which was not at all related to r298053. r298053 only caused a change in the loop order, as this change resulted in a different isl internal representation which caused the scheduler to derive a different schedule. This change then caused the IR to change, which apparently created a pattern in which LLVM exploits the lifetime markers. It seems we are using the OpenMP context outside of the lifetime markers. Even though CrystalMk could probably be fixed by expanding the scope of the lifetime markers, it is not clear what happens in case the OpenMP function call is in a loop which will cause a sequence of starting and ending lifetimes. As it is unlikely that the lifetime markers give any performance benefit, we just drop them to remove complexity. llvm-svn: 298192	2017-03-18 20:10:07 +00:00
Siddharth Bhat	3e4a7d38ab	[ScheduleOptimiser] fix typos in top comment [NFC] coice -> choice Transations -> Transactions llvm-svn: 298095	2017-03-17 14:52:19 +00:00
Michael Kruse	89b1f94e64	Revert "Remove references to AssumptionCache. NFC." The AssumptionCache removal of r289756 has been reverted in r290086/r290087. A different solution has been implemented in r291671 which keeps the AssumptionCache. We can therefore use it again in Polly. This reverts r289791. llvm-svn: 298089	2017-03-17 13:56:53 +00:00
Siddharth Bhat	4fe11cf95f	[DependenceInfo] Remove idempotent union: must-writes with may-writes [NFC] Since may-writes are always a superset of the must-writes, there is no point in taking a union of one with the other. llvm-svn: 298085	2017-03-17 13:26:10 +00:00
Michael Kruse	9b91c62e3a	[ScopInfo/PruneUnprofitable] Move default profitability check. In the previous default, ScopInfo applied the profitability heuristic for scalar accesses (-polly-unprofitable-scalar-accs=true) and the -polly-prune-unprofitable pass was disabled by default (-polly-enable-prune-unprofitable=false) as that pruning was already done. This change switches the defaults to -polly-unprofitable-scalar-accs=false -polly-enable-prune-unprofitable=true such that the scalar access heuristic check is done by the pass. This allows passes between ScopInfo and PruneUnprofitable to optimize away scalar accesses. Without enabling such intermediate passes, there is no change in behaviour of profitability checks in a PassManagerBuilder built pass chain, but it allows us to cover this configuration with the buildbots. Suggested-by: Tobias Grosser <tobias@grosser.es> llvm-svn: 298081	2017-03-17 13:10:05 +00:00
Michael Kruse	f3091bf4cf	[PruneUnprofitable] Add -polly-prune-unprofitable pass. ScopInfo's normal profitability heuristic considers SCoPs where all statements have scalar writes as not profitably optimizable and invalidates the SCoP in that case. However, -polly-delicm and -polly-simplify may be able to remove some of the scalar writes such that the flag -polly-unprofitable-scalar-accs=false allows disabling that part of the heuristic. In cases where DeLICM (or other passes after ScopInfo) are not successful in removing scalar writes, the SCoP is still not profitably optimizable. The schedule optimizer would again try computing another schedule, resulting in slower compilation. The -polly-prune-unprofitable pass applies the profitability heuristic again before the schedule optimizer, so Polly can still bail out even with -polly-unprofitable-scalar-accs=false. Differential Revision: https://reviews.llvm.org/D31033 llvm-svn: 298080	2017-03-17 13:09:52 +00:00
Tobias Grosser	5842dee251	[ScopInfo] Add option to not add parameter bounds to context [NFC] For experiments it is sometimes helpful to provide parameter bound information to polly and to not use these parameter bounds for simplification. Add a new option "-polly-ignore-parameter-bounds" which does precisely this. llvm-svn: 298077	2017-03-17 13:00:53 +00:00
Siddharth Bhat	db5dd14cbb	[DependenceInfo] Replace use of deprecated isl_dim_n_out [NFC] Change isl_dim_n_out to isl_map_dim(*, isl_dim_out) llvm-svn: 298075	2017-03-17 12:59:01 +00:00
Siddharth Bhat	65f3d5201e	[DependenceInfo] Track may-writes and build flow information in Dependences::calculateDependences. This ensures that we handle may-writes correctly when building dependence information. Also add a test case checking correctness of may-write information. Not handling it before was an oversight. Differential Revision: https://reviews.llvm.org/D31075 llvm-svn: 298074	2017-03-17 12:31:28 +00:00
Tobias Grosser	8a6e605e96	[ScopInfo] Do not take inbounds assumptions [NFC] For experiments it is sometimes helpful to not take any inbounds assumptions. Add a new option "-polly-ignore-inbounds" which does precisely this. llvm-svn: 298073	2017-03-17 12:26:58 +00:00
Tobias Grosser	b58ed8d3cd	[ScopInfo] Do not try to eliminate parameter dimensions that do not exist In subsequent changes we will make Polly a little bit more lazy in adding parameter dimensions to different sets. As a result, not all parameters will always be part of the parameter space. This change ensures that we do not use the '-1' returned when a parameter dimension cannot be found, but instead just do not try to eliminate the anyhow non-existing dimension. llvm-svn: 298054	2017-03-17 09:02:53 +00:00
Tobias Grosser	941cb7d979	[ScopInfo] Do not expand getDomains() to full parameter space. For several years, isl has been able to perform most operations on sets with differing parameter spaces, by expanding the parameter space on demand, relying on named isl ids to distinguish different parameter dimensions. By not always expanding to full dimensionality the sets remain smaller and can likely be operated on faster. This change by itself did not yet result in measurable performance benefits, but it is a step in the right direction needed to ensure that subsequent changes indeed can work with lower-dimensional sets and these sets do not get blown up by accident when later intersected with the domain context. llvm-svn: 298053	2017-03-17 09:02:50 +00:00
Tobias Grosser	f4fe34bfb8	Update to isl-0.18-387-g3fa6191 This is a normal / regular maintenance update. llvm-svn: 297999	2017-03-16 21:33:20 +00:00
Siddharth Bhat	65c4026992	Set Dependences::RED to be non-null once Dependences::calculateDependences() occurs, even if there is no actual reduction. This ensures correctness with isl operations. llvm-svn: 297981	2017-03-16 20:06:49 +00:00
Michael Kruse	5545407fa4	[ScopInfo] Introduce ScopStmt::getSurroundingLoop(). NFC. Introduce ScopStmt::getSurroundingLoop() to replace getFirstNonBoxedLoopFor. getSurroundingLoop() returns the precomputed surrounding/first non-boxed loop. Except in ScopDetection, the list of boxed loops is only used to get the surrounding loop. getFirstNonBoxedLoopFor also requires LoopInfo at every use which is not necessarily available everywhere where we may want to use it. Differential Revision: https://reviews.llvm.org/D30985 llvm-svn: 297899	2017-03-15 22:16:43 +00:00
Tobias Grosser	d614b3e6bd	Preserve the isl-noexceptions.h C++ bindings when updating isl The bindings currently need to be generated manually, as they are not yet part of the official isl distribution. Hence, we keep them across updates assuming they only need to be updated when new functions or functionality should be exposed. llvm-svn: 297710	2017-03-14 07:46:28 +00:00
Tobias Grosser	9c19a0e16a	Add back header file that was accidentally dropped in previous update llvm-svn: 297709	2017-03-14 07:39:05 +00:00
Tobias Grosser	593ebdfbd1	Update to isl-0.18-369-g5e613c6 This is a regular maintenance update. llvm-svn: 297708	2017-03-14 07:33:26 +00:00
Tobias Grosser	c9d4cb2f42	[ScheduleOptimizer] Allow tiling after fusion In ScheduleOptimizer::isTileableBand(), allow the case in which the band node's child is an isl_schedule_sequence_node and its grandchildren isl_schedule_leaf_nodes. This case can arise when two or more statements are fused by the isl scheduler. The tile_after_fusion.ll test has two statements in separate loop nests and checks whether they are tiled after being fused when polly-opt-fusion equals "max". Reviewers: grosser Subscribers: gareevroman, pollydev Tags: #polly Contributed-by: Theodoros Theodoridis <theodort@student.ethz.ch> Differential Revision: https://reviews.llvm.org/D30815 llvm-svn: 297587	2017-03-12 19:02:31 +00:00
Tobias Grosser	de244eb450	Possible error in doc comment If a SCoP is most probably sequential, then it's better to run it on a CPU. Hence, there's no point in running it on a GPU. Reviewers: grosser Subscribers: nemanjai Tags: #polly Contributed-by: Singapuram Sanjay <singapuram.sanjay@gmail.com> Differential Revision: https://reviews.llvm.org/D30864 llvm-svn: 297578	2017-03-12 08:19:01 +00:00
Tobias Grosser	b2347dc241	[isl++] Add missing /* implicit */ marker llvm-svn: 297577	2017-03-12 08:17:50 +00:00
Tobias Grosser	5ac963743f	[isl++] Add last set of missing isl:: prefixes to increase consistency [NFC] llvm-svn: 297558	2017-03-11 07:58:12 +00:00
Tobias Grosser	9cc7e3561d	[unittest] Do not convert large unsigned long to isl::val Currently the isl::val constructor only takes a signed long as parameter, which on Windows is only 32 bit large and can consequently not be used to obtain the same result when loading from the expression '(1ull << 32) - 1' that we get when loading this value via isl_val_int_from_ui or when loading the value on Linux systems with 64 bit long data types. We avoid this issue by performing the shift and subtraction within the isl::val. It would be nice to teach the isl++ bindings to construct isl::val from other integer types, but the current interface is not sufficient to do so. If constructors from both signed long and unsigned long are exposed, then integer literals that are of type 'int' and which must be converted to 'long' to match the constructor have two ambiguous constructors to choose from, which results in a compiler error. The right solution is likely to additionally expose constructors from signed and unsigned int, but these are not yet available in the isl C interface and adding those ad hoc to our bindings is something I would like to avoid for now. We should address this issue with a proper discussion on the isl side. llvm-svn: 297522	2017-03-10 22:25:39 +00:00
Tobias Grosser	d67d368e12	[isl++] Add namespace prefixes to isl::ctx and isl::stat These were missed in r297478. We add them for consistency. llvm-svn: 297520	2017-03-10 22:10:19 +00:00
Tobias Grosser	30a06dce68	[isl++] Drop warning about experimental status As most discussions about these bindings have concluded and only the final patch review on the isl mailing list is missing, we drop the experimental warning tag to match the patchset we will submit to isl, which is expected to not change notably any more. llvm-svn: 297519	2017-03-10 22:10:15 +00:00
Tobias Grosser	9839774e5d	[isl++] Do not use enum prefix Instead of declaring a function as: inline val plain_get_val_if_fixed(enum dim type, unsigned int pos) const; we use: inline isl::val plain_get_val_if_fixed(isl::dim type, unsigned int pos) const; The first argument caused the following compile time error on windows: "error C3431: 'dim': a scoped enumeration cannot be redeclared as an unscoped enumeration" In some cases it is sufficient to just drop the 'enum' prefix, but for example for isl::set the 'enum class dim' type collides with the function name isl::set::dim and can consequently not be referenced. To avoid such kind of ambiguities in the future we add the isl:: prefix consistently to all types used. Reported-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 297478	2017-03-10 17:01:30 +00:00
Michael Kruse	0446d81e2d	[Simplify] Add -polly-simplify pass. This new pass removes unnecessary accesses and writes. It currently supports 2 simplifications, but more are planned. It removes write accesses that write a loaded value back to the location it was loaded from. It is a typical artifact from DeLICM. Removing it will get rid of bogus dependencies later in dependency analysis. It also removes statements without side-effects. ScopInfo already removes these, but the removal of unnecessary writes can result in more side-effect free statements. Differential Revision: https://reviews.llvm.org/D30820 llvm-svn: 297473	2017-03-10 16:05:24 +00:00
Tobias Grosser	3e618c33fe	[DeadCodeElimination] Translate to C++ bindings This pass is a small and self-contained example of a piece of code that was written with the isl C interface. The diff of this change nicely shows how the C++ bindings can improve the readability of the code by avoiding the long C function names and by avoiding any need for memory management. As you will see, no calls to isl_*_copy or isl_*_free are needed anymore. Instead the C++ interface takes care of automatically managing the objects. This may introduce internally additional copies, but due to the isl reference counting, such copies are expected to be cheap. For performance critical operations, we will later exploit move semantics to eliminate unnecessary copies that have shown to be costly. Below we give a set of examples that shows the benefit of the C++ interface vs. the pure C interface. Check properties ---------------- Before: if (isl_aff_is_zero(aff) || isl_aff_is_one(aff)) return true; After: if (Aff.is_zero() || Aff.is_one()) return true; Type conversion --------------- Before: isl_union_pw_multi_aff *UPMA = isl_union_pw_multi_aff_from_union_map(umap); After: isl::union_pw_multi_aff UPMA = UMap; Type construction ----------------- Before: auto Empty = isl_union_map_empty(space); After: auto Empty = isl::union_map::empty(Space); Operations ---------- Before: set = isl_union_set_intersect(set, set2); After: Set = Set.intersect(Set2); The use of isl::boolean in return types also increases the robustness of Polly, as on conversion to true or false, we verify that no isl_bool_error has been returned and assert in case an error was returned. Before this change we would have just ignored the error and proceeded with (some) execution path. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30619 llvm-svn: 297466	2017-03-10 15:05:38 +00:00
Tobias Grosser	3cc57fa1e7	[unittest] Translate isl tests to C++ bindings For this translation we introduce two functions, valFromAPInt and APIntFromVal, to convert between isl::val and APInt. For now these are just proxies, but in the future they will replace the current isl_val* based conversion functions. The isl unit test cases benefit most from the new isl::boolean (from Michael Kruse), which can be explicitly cast to bool and which -- as part of this cast -- emits a check that no error condition has been triggered so far. This allows us to simplify EXPECT_EQ(isl_bool_true, isl_val_is_zero(IslZero)); to EXPECT_TRUE(IslZero.is_zero()); This simplification also becomes very clear in operator==, which changes from auto IsEqual = isl_set_is_equal(LHS.keep(), RHS.keep()); EXPECT_NE(isl_bool_error, IsEqual); return IsEqual; to just return bool(LHS.is_equal(RHS)); Some background for non-isl users. The isl C interface has an isl_bool type, which can be either true, false, or error. Hence, whenever a function returns a value of type isl_bool, an explicit error check should be considered. By using isl::boolean, we can just cast the isl::boolean to 'bool' or simply use the isl::boolean in a context where it will automatically be cast to bool (e.g., in an if-condition). When doing so, the C++ bindings automatically add code that verifies that the return value is not an error code. If it is, the program will warn about this and abort. For cases where errors are expected, isl::boolean provides checks such as boolean::is_true_or_error() or boolean::is_true_no_error() to explicitly control program behavior in case of error conditions. Thanks to the new automatic memory management, we also can avoid many calls to isl_*_free. For code that had previously been managed by IslPtr<>, many calls to give/take/copy are eliminated. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30618 llvm-svn: 297464	2017-03-10 14:58:50 +00:00
Tobias Grosser	51ebda8c9d	[FlattenAlgo] Translate to C++ bindings Translate the full algorithm to use the new isl C++ bindings This is a large piece of code that has been written with the Polly IslPtr<> memory management tool, which only performed memory management, but did not provide a method interface. As such the code was littered with calls to give(), copy(), keep(), and take(). The diff of this change should give a good example how the new method interface simplifies the code by removing the need for switching between managed types and C functions all the time and consequently also the need to use the long C function names. These are a couple of examples comparing the old IslPtr memory management interface with the complete method interface. Check properties ---------------- Before: if (isl_aff_is_zero(Aff.get()) \|\| isl_aff_is_one(Aff.get())) return true; After: if (Aff.is_zero() \|\| Aff.is_one()) return true; Type conversion --------------- Before: isl_union_pw_multi_aff *UPMA = give(isl_union_pw_multi_aff_from_union_map(UMap.copy()); After: isl::union_pw_multi_aff UPMA = UMap; Type construction ----------------- Before: auto Empty = give(isl_union_map_empty(Space.copy()); After: auto Empty = isl::union_map::empty(Space); Operations ---------- Before: Set = give(isl_union_set_intersect(Set.copy(), Set2.copy()); After: Set = Set.intersect(Set2); Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30617 llvm-svn: 297463	2017-03-10 14:55:58 +00:00
Tobias Grosser	4c24e57965	Add method interface to isl C++ bindings The isl C++ binding method interface introduces a thin C++ layer that allows calling isl methods directly on the memory managed C++ objects. This makes the relevant methods directly available via code-completion interfaces, allows for the use of overloading, conversion constructors, and many other nice C++ features that make using isl a lot easier. The individual features will be highlighted in the subsequent commits. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30616 llvm-svn: 297462	2017-03-10 14:53:00 +00:00
Tobias Grosser	deaef15f52	Introduce isl C++ bindings, Part 1: value_ptr style interface Over the last couple of months several authors of independent isl C++ bindings worked together to jointly design an official set of isl C++ bindings which combines their experience in developing isl C++ bindings. The new bindings have been designed around a value pointer style interface and remove the need for explicit pointer management and instead use C++ language features to manage isl objects. This commit introduces the smart-pointer part of the isl C++ bindings and replaces the current IslPtr<T> classes, which served the very same purpose, but had to be manually maintained. Instead, we now rely on automatically generated classes for each isl object, which provide value_ptr semantics. An isl object has the following smart pointer interface: inline set manage(__isl_take isl_set ptr); class set { friend inline set manage(__isl_take isl_set ptr); isl_set ptr = nullptr; inline explicit set(__isl_take isl_set ptr); public: inline set(); inline set(const set &obj); inline set &operator=(set obj); inline ~set(); inline __isl_give isl_set copy() const &; inline __isl_give isl_set copy() && = delete; inline __isl_keep isl_set get() const; inline __isl_give isl_set release(); inline bool is_null() const; } The interface and behavior of the new value pointer style classes is inspired by http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf, which proposes a std::value_ptr, a smart pointer that applies value semantics to its pointee. We currently only provide a limited set of public constructors and instead provide a global overloaded type constructor method "isl::obj isl::manage(isl_obj )", which allows converting an isl_set to an isl::set by calling 'S = isl::manage(s)'. This pattern models the make_unique() constructor for unique pointers. 
The next two functions isl::obj::get() and isl::obj::release() are taken directly from the std::value_ptr proposal: S.get() extracts the raw pointer of the object managed by S. S.release() extracts the raw pointer of the object managed by S and sets the object in S to null. We additionally add isl::obj::copy(). S.copy() returns a raw pointer referring to a copy of S, which is a shortcut for "isl::obj(oldobj).release()", a functionality commonly needed when interacting directly with the isl C interface where all methods marked with __isl_take require consumable raw pointers. S.is_null() checks if S manages a pointer or if the managed object is currently null. We add this function to provide a more explicit way to check if the pointer is empty compared to a direct conversion to bool. This commit also introduces a couple of polly-specific extensions that cover features currently not handled by the official isl C++ bindings draft, but which have been provided by IslPtr<T> and are consequently added to avoid code churn. These extensions include: - operator bool() : Conversion from objects to bool - construction from nullptr_t - get_ctx() method - take/keep/give methods, which match the currently used naming convention of IslPtr<T> in Polly. They just forward to (release/get/manage). - raw_ostream printers We expect that these extensions are over time either removed or upstreamed to the official isl bindings. We also export a couple of classes that have not yet been exported in isl (e.g., isl::space) As part of the code review, the following two questions were asked: - Why do we not use a standard smart pointer? std::value_ptr was a proposal that has not been accepted. It is consequently not available in the standard library. Even if it were available, we want to expand this interface with a complete method interface that is conveniently available from each managed pointer. 
The most direct way to achieve this is to generate a specialized value style pointer class for each isl object type and add any additional methods to this class. The relevant changes follow in subsequent commits. - Why do we not use templates or macros to avoid code duplication? It is certainly possible to use templates or macros, but as this code is auto-generated there is no need to make writing this code more efficient. Also, most of these classes will be specialized with individual member functions in subsequent commits, such that there will be little code reuse to exploit. Hence, we decided not to do so at the moment. These bindings are not yet officially part of isl, but the draft is already very stable. The smart pointer interface itself has not changed for several months. Adding this code to Polly is against our normal policy of only importing official isl code. In this case however, we make an exception to showcase a non-trivial use case of these bindings which should increase confidence in these bindings and will help upstreaming them to isl. Tags: #polly Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D30325 llvm-svn: 297452	2017-03-10 11:41:03 +00:00
Tobias Grosser	e5671e54c0	Update to isl-0.18-356-g0b05d01 This is a regular maintenance update. llvm-svn: 297449	2017-03-10 09:17:55 +00:00
Michael Kruse	0666a76aac	[Support] Correct filename in file head comment. NFC. llvm-svn: 297430	2017-03-10 00:36:54 +00:00
Michael Kruse	e4292bf086	[Support] Add -polly-dump-module pass. This pass allows writing the LLVM-IR just before and after the Polly passes to a file. Dumping the IR before Polly helps reproduce bugs that occur in code generated by clang. It is the only reliable way to get the IR that triggers a bug. The alternative is to emit the IR with clang -c -emit-llvm -S -o dump.ll then pass it through all optimization passes opt dump.ll -basicaa -sroa ... -S -o optdump.ll to then reproduce the error with opt optdump.ll -polly-opt-isl -polly-codegen -analyze However, the IR is not the same. -O3 uses a PassBuilder that creates passes with different parameters than the default. Dumping the IR after Polly is useful to compare a miscompilation with a known-good configuration. Differential Revision: https://reviews.llvm.org/D30788 llvm-svn: 297415	2017-03-09 22:29:58 +00:00
Michael Kruse	a9520b94d5	[Cmake] Generate a PollyConfig.cmake. Generate a PollyConfig.cmake for use with Cmake's find_package in out-of-tree projects. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D30495 llvm-svn: 297395	2017-03-09 17:58:20 +00:00
Tobias Grosser	8bd7f3c0a5	[ScopDetect/Info] Allow unconditional hoisting of loads from dereferenceable ptrs In case LLVM pointers are annotated with !dereferenceable attributes/metadata or LLVM can look at the allocation from which a pointer is derived, we can know that dereferencing pointers is safe and can be done unconditionally. We use this information to prove that certain pointers are safe to hoist and then hoist them unconditionally. llvm-svn: 297375	2017-03-09 11:36:00 +00:00
Michael Kruse	9fb3ab1b19	[DeLICM] Add -polly-delicm-overapproximate-writes option. One of the current limitations of DeLICM is that it only creates PHI WRITEs that it knows are read by some PHI. Such writes may not span all instances of a statement. Polly's code generator currently does not support MemoryAccesses that are not executed in all instances ('partial accesses') and so has to give up on a possible mapping. This workaround has once been suggested by Tobias Grosser: Try to interpolate an arbitrary expansion to all instances. It will be checked for possible conflicts with the existing Knowledge and can be applied if the conflict checking result is that no semantics are changed. Expansion is done by simplifying the mapping by coalescing with the hope that coalescing will find a polyhedral 'rule' of the relevant map. It is then 'gist'-ed using the domain of the relevant instances such that the rule is expanded to the universe and finally intersected with the domain of all statement instances. The expansion makes conflicts become more likely, the found rule may still not encompass all statement instances and the found rule exposes internals of isl's implementation of coalesce and gist. The latter means that the result depends on how much effort the implementation invests into finding a rule which may change between versions of isl. Trivial implementations of gist and coalesce just return the input arguments. A patch that makes codegen support partial accesses is in preparation as well. Differential Revision: https://reviews.llvm.org/D30763 llvm-svn: 297373	2017-03-09 11:23:22 +00:00
Michael Kruse	935b2a3654	[DeadCodeElim] Put -polly-dce-precise-steps into the Polly category. llvm-svn: 297318	2017-03-08 23:25:35 +00:00
Michael Kruse	6744efa8d8	[ScopDetection] Only allow SCoP-wide available base pointers. Simplify ScopDetection::isInvariant(). Essentially deny everything that is defined within the SCoP and is not load-hoisted. The previous understanding of "invariant" has a few holes: - Expressions without side-effects with only invariant arguments, but are defined within the SCoP's region with the exception of selects and PHIs. These should be part of the index expression derived by ScalarEvolution and not of the base pointer. - Function calls that are !mayHaveSideEffects() (typically functions with "readnone nounwind" attributes). An example is given below. @C = external global i32 declare float* @getNextBasePtr(float) readnone nounwind ... %ptr = call float @getNextBasePtr(float* %A, float %B) The call might return: * %A, so %ptr aliases with it in the SCoP * %B, so %ptr aliases with it in the SCoP * @C, so %ptr aliases with it in the SCoP * a new pointer every time it is called, such as malloc() * a pointer into the allocated block of one of the aforementioned * any of the above, at random at each call Hence, and in contrast to a comment in the base_pointer.ll regression test, %ptr is not necessarily the same all the time. It might also alias with anything and no AliasAnalysis can tell otherwise if the definition is external. It is hence not suitable in the role of a base pointer. The practical problem with base pointers defined in SCoP statements is that they are not available globally in the SCoP. The statement instance must be executed first before the base pointer can be used. This is no problem if the base pointer is transferred as a scalar value between statements. Uses of MemoryAccess::setNewAccessRelation may add a use of the base pointer anywhere in the array. setNewAccessRelation is used by JSONImporter, DeLICM and D28518. 
Indeed, BlockGenerator currently assumes that base pointers are available globally and generates invalid code for new access relations (referring to the base pointer of the original code) if not, even if the base pointer would be available in the statement. This could be fixed with some added complexity and restrictions. The ExprBuilder must look up the local BBMap and code that calls setNewAccessRelation must check whether the base pointer is available first. The code would still be incorrect in the presence of aliasing. There is the switch -polly-ignore-aliasing to explicitly allow this, but it is hardly a justification for the additional complexity. It would still be mostly useless because in most cases either getNextBasePtr() has external linkage in which case the readnone nounwind attributes cannot be derived in the translation unit itself, or is defined in the same translation unit and gets inlined. Reviewed By: grosser Differential Revision: https://reviews.llvm.org/D30695 llvm-svn: 297281	2017-03-08 15:14:46 +00:00
Michael Kruse	5a4ec5c42b	[ScopDetection] Require LoadInst base pointers to be hoisted. Only when load-hoisted can we be sure the base pointer is invariant during the SCoP's execution. Most of the time it would be added to the required hoists for the alias checks anyway, except with -polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if AliasAnalysis is already sure it doesn't alias with anything (for instance if there is no other pointer to alias with). Two more parts in Polly assume that this load-hoisting took place: - setNewAccessRelation() which contains an assert which tests this. - BlockGenerator which would use the base ptr from the original code if not load-hoisted (if the access expression is regenerated) Differential Revision: https://reviews.llvm.org/D30694 llvm-svn: 297195	2017-03-07 20:28:43 +00:00
Tobias Grosser	a0b85963ba	Update isl to isl-0.18-336-g1e193d9 This is a regular maintenance update llvm-svn: 297169	2017-03-07 17:53:34 +00:00
Tobias Grosser	6c9958e0b3	[tests] Make sure tests do not end in 'unreachable' - Part III There is no point in optimizing unreachable code, hence our test cases should always return. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297158	2017-03-07 16:28:53 +00:00
Tobias Grosser	2d233fb35d	[tests] Update bounds-check elimination test cases These test cases should work in combination with https://reviews.llvm.org/D12676, but became outdated over time. Update them in preparation of discussions with Daniel Berlin on how to represent unreachable in the post-dominator tree. llvm-svn: 297157	2017-03-07 16:17:58 +00:00
Tobias Grosser	ce69e7b593	[ScopInfo] Avoid infinite loop during schedule construction Our current scop modeling enters an infinite loop when trying to model code that has unreachable instructions (e.g., test/ScopInfo/BoundChecks/single-loop.ll), as the number of basic blocks returned by the LLVM Loop* does not include unreachable basic blocks that branch off from the core loop body. This arises for example in the following piece of code: for (i = 0; i < N; i++) { if (i > 1024) abort(); <- this abort might be translated to an unreachable A[i] = ... } This patch adds these unreachable basic blocks in our per loop basic block count to ensure that the schedule construction does not assume a loop has been processed completely, despite certain unreachable basic blocks still remaining. The infinite loop is only observable in combination with https://reviews.llvm.org/D12676 or a similar patch. llvm-svn: 297156	2017-03-07 16:17:55 +00:00
Tobias Grosser	134a572951	[ScopDetection] Do not detect scops that exit to an unreachable Scops that exit with an unreachable are today still permitted, but make little sense to optimize. We therefore can already skip them during scop detection. This speeds up scop detection in certain cases and also ensures that bugpoint does not introduce unreachables when reducing test cases. In practice this change should have little impact, as the performance of unreachable code is unlikely to matter. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297151	2017-03-07 15:50:43 +00:00
Tobias Grosser	87dcd46aa7	[tests] Make sure tests do not end in 'unreachable' - Part II There is no point in optimizing unreachable code, hence our test cases should always return. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297150	2017-03-07 15:23:30 +00:00
Tobias Grosser	2dc1f547ae	[tests] Make sure tests do not end in 'unreachable' There is no point in optimizing unreachable code, hence our test cases should always return. This commit is part of a series that makes Polly more robust in the presence of unreachables. llvm-svn: 297147	2017-03-07 15:17:23 +00:00
Sanjoy Das	b641a90529	Adapt to llvm change r296992 to unbreak the bots r296992 made ScalarEvolution's CompareValueComplexity less aggressive, and that broke the polly test being fixed in this change. This change explicitly bumps CompareValueComplexity in said test case to make it pass. Can someone from the polly team please give me an idea of whether this case is important enough to have scalar-evolution-max-value-compare-depth be 3 by default? llvm-svn: 296994	2017-03-06 01:12:16 +00:00
Tobias Grosser	7d136d952e	[tests] Specify the dependence to NVPTX backend for Polly ACC test cases Some Polly ACC test cases fail without a working NVPTX backend. We explicitly specify this dependence in REQUIRES. Alternatively, we could have only marked polly-acc as supported in case the NVPTX backend is available, but as we might use other backends in the future, this does not seem to be the best choice. For this to work, we also need to make the 'targets_to_build' information available. Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 296853	2017-03-03 03:38:50 +00:00
Tobias Grosser	9d551da5c1	[test] Do not emit binary data to output Suggested-by: Michael Kruse <llvm@meinersbur.de> llvm-svn: 296852	2017-03-03 03:24:34 +00:00
Tobias Grosser	7a93d94a8f	Revert "Currently broken by recent LLVM upstream changes" This reverts commit r296579, which is not needed anymore as the relevant changes in trunk have been reverted. llvm-svn: 296817	2017-03-02 21:43:50 +00:00
Tobias Grosser	1c787e0b49	[ScopDetection] Do not allow required-invariant loads in non-affine region These loads cannot be safely hoisted as the condition guarding the non-affine region cannot be duplicated to also protect the hoisted load later on. Today they are dropped in ScopInfo. By checking for this early, we do not even try to model them and possibly can still optimize smaller regions not containing this specific required-invariant load. llvm-svn: 296744	2017-03-02 12:15:37 +00:00
Tobias Grosser	c2f151084d	[ScopInfo] Disable memory folding in case it results in multi-disjunct relations Multi-disjunct access maps can easily result in inbound assumptions which explode in case of many memory accesses and many parameters. This change reduces compilation time of some larger kernel from over 15 minutes to less than 16 seconds. Interesting is the test case test/ScopInfo/multidim_param_in_subscript.ll which has a memory access [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] } which requires folding, but where only a single disjunct remains. We can still model this test case even when only using limited memory folding. For people only reading commit messages, here is the comment that explains what memory folding is: To recover memory accesses with array size parameters in the subscript expression we post-process the delinearization results. We would normally recover from an access A[exp0(i) * N + exp1(i)] into an array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the range of exp1(i) - may be preferable. Specifically, for cases where we know exp1(i) is negative, we want to choose the latter expression. As we commonly do not have any information about the range of exp1(i), we do not choose one of the two options, but instead create a piecewise access function that adds the (-1, N) offsets as soon as exp1(i) becomes negative. For a 2D array such an access function is created by applying the piecewise map: [i,j] -> [i, j] : j >= 0 [i,j] -> [i-1, j+N] : j < 0 After this patch we generate only the first case, except for situations where we can prove the first case to be invalid and can consequently select the second without introducing disjuncts. llvm-svn: 296679	2017-03-01 21:11:27 +00:00
Tobias Grosser	24222c7357	Fix namespaces after clang-format update llvm-svn: 296635	2017-03-01 15:54:27 +00:00
Tobias Grosser	6f9b60cf38	Currently broken by recent LLVM upstream changes We mark it as XFAIL to get buildbots back to green, until the upstream changes have been addressed. llvm-svn: 296579	2017-03-01 04:34:44 +00:00
Tobias Grosser	d7c4975349	[ScopInfo] Simplify inbounds assumptions under domain constraints Without this simplification for a loop nest: void foo(long n1_a, long n1_b, long n1_c, long n1_d, long p1_b, long p1_c, long p1_d, float A_1[][p1_b][p1_c][p1_d]) { for (long i = 0; i < n1_a; i++) for (long j = 0; j < n1_b; j++) for (long k = 0; k < n1_c; k++) for (long l = 0; l < n1_d; l++) A_1[i][j][k][l] += i + j + k + l; } the assumption: n1_a <= 0 or (n1_a > 0 and n1_b <= 0) or (n1_a > 0 and n1_b > 0 and n1_c <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d <= 0) or (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d > 0 and p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d) is taken rather than the simpler assumption: p9_b >= n9_b and p9_c >= n9_c and p9_d >= n9_d. The former is less strict, as it allows arbitrary values of p1_* in case, the loop is not executed at all. However, in practice these precise constraints explode when combined across different accesses and loops. For now it seems to make more sense to take less precise, but more scalable constraints by default. In case we find a practical example where more precise constraints are needed, we can think about allowing such precise constraints in specific situations where they help. This change speeds up the new test case from taking very long (waited at least a minute, but it probably takes a lot more) to below a second. llvm-svn: 296456	2017-02-28 09:45:54 +00:00
Tobias Grosser	cf66ea3845	Update isl to isl-0.18-304-g1efe43d This is a normal maintenance update. llvm-svn: 296441	2017-02-28 07:06:06 +00:00
Michael Kruse	6469380daa	[Cmake] Optionally use a system isl version. This patch adds an option to build against a version of libisl already installed on the system. The installation is autodetected using the pkg-config file shipped with isl. The detection of the library is in the FindISL.cmake module that creates an imported target. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D30043 llvm-svn: 296361	2017-02-27 17:54:25 +00:00
Michael Kruse	c4f61d2346	[DeLICM] Add nomap regression tests. NFC. These verify that some scalars are not mapped because it would be incorrect to do so. For these checks we verify that no transformation has been executed from output of the pass's '-analyze'. Adding optimization remarks is not useful as it would result in too many messages, even repeated ones. I avoided checking the '-debug-only=polly-delicm' output which is an antipattern. llvm-svn: 296348	2017-02-27 15:53:18 +00:00
Michael Kruse	b295c37a15	[DeLICM] Statistics for use in regression tests. Print some measurements of the DeLICM transformation at -analyze to be used in regression tests. llvm-svn: 296347	2017-02-27 15:53:13 +00:00
Roman Gareev	bc3fbe49c5	Disable the parallel code generation in case of extension nodes We can not perform the dependence analysis and, consequently, the parallel code generation in case the schedule tree contains extension nodes. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30394 llvm-svn: 296325	2017-02-27 08:03:11 +00:00
Michael Kruse	e199f285b0	[DeLICM] Fortify against exceeding isl's max operations counter. Control flow would flow-through after the check whether the operations quota was exceeded, with the intention that it would later be caught by Knowledge::isUsable(). However, the Knowledge constructor has its own assertions to check consistency which would fail if its fields have only been initialized partially because some sets have been computed correctly before the operations quota takes effect. Fix by erroring-out early instead of falling through into the code that might expect that everything has been computed correctly. For robustness, also bail-out if any of the fields contain nullptr values instead of relying on isl always setting exactly this error code if something went wrong. This should fix the perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable (-polly-process-unprofitable -polly-position=before-vectorizer -polly-enable-delicm) buildbot. llvm-svn: 296022	2017-02-23 21:58:20 +00:00
Michael Kruse	f4e201e09f	[Support] Remove NonowningIslPtr. NFC. NonowningIslPtr<isl_X> was used as the type of function parameters when the function does not consume the isl object, i.e. an __isl_keep parameter. The alternatives are: 1. IslPtr<isl_X> This has additional calls to isl_X_copy and isl_X_free to increase/decrease the reference counter even though not needed. The caller already owns a reference to the isl object. 2. const IslPtr<isl_X>& This does not change the reference counter, but requires an additional load to get the pointer to the isl object (instead of just passing the pointer itself). Moreover, the compiler cannot rely on the constness of the pointer and has to reload the pointer every time it writes to memory (unless alias analysis such as TBAA says it is not possible). The isl C++ bindings currently in development do not have an equivalent to NonowningIslPtr and adding one would make the binding more complicated and its advantage in performance is small. In order to simplify the transition to these C++ bindings, remove NonowningIslPtr. Change every former use of it to alternative 2 mentioned above (const IslPtr<isl_X>&). llvm-svn: 295998	2017-02-23 17:57:27 +00:00
Michael Kruse	2c7169d00c	[DependenceInfo] Remove unused variable. NFC. llvm-svn: 295987	2017-02-23 15:41:01 +00:00
Michael Kruse	dd6f29375b	[DependenceInfo] Use references instead of double pointers. NFC. Non-const references are the more C++-ish way to modify a variable passed by the caller. llvm-svn: 295986	2017-02-23 15:40:56 +00:00
Michael Kruse	ec8fc32160	[DependenceInfo] Rename StmtScheduleDomain -> TaggedStmtDomain. NFC. llvm-svn: 295985	2017-02-23 15:40:52 +00:00
Michael Kruse	00c38e0df2	[DependenceInfo] Simplify use of StmtSchedule's domain [NFC] Once a StmtSchedule is created, only its domain is used anywhere within DependenceInfo::calculateDependences. So, we choose to return the wrapped domain of the union_map rather than the entire union_map. However, we still build the union_map first within collectInfo(). It is cleaner to first build the entire union_map and then pull the domain out in one shot, rather than repeatedly extracting the domain in bits and pieces from accdom. Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30208 llvm-svn: 295984	2017-02-23 15:40:46 +00:00
Michael Kruse	52ab4943b4	Remove all references to PostDominators. NFC. Marking a pass as preserved is necessary if any Polly pass uses it, even if it is not preserved within the generated code. Not marking it would cause the Polly pass chain to be interrupted. It is not used by any Polly pass anymore, hence we can remove all references to it. llvm-svn: 295983	2017-02-23 15:16:22 +00:00
Michael Kruse	9f519714b3	[DeLICM] Add missing Doxygen comment. NFC. llvm-svn: 295978	2017-02-23 14:51:50 +00:00
Michael Kruse	311ecb00dc	[DeLICM] Capitalize parameter name. NFC. llvm-svn: 295977	2017-02-23 14:51:45 +00:00
Tobias Grosser	59d23bbdc6	Update isl to isl-0.18-282-g12465a5 Besides a variety of smaller cleanups, this update also contains a correctness fix to isl coalesce which resolves a crash in Polly. llvm-svn: 295966	2017-02-23 12:48:42 +00:00
Roman Gareev	96e1119a96	Make optimizations based on pattern matching be enabled by default Currently, pattern based optimizations of Polly can identify matrix multiplication and optimize it according to BLIS matmul optimization pattern (see ScheduleTreeOptimizer for details). This patch makes optimizations based on pattern matching be enabled by default. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D30293 llvm-svn: 295958	2017-02-23 11:44:12 +00:00
Michael Kruse	d8d32bb3d1	[DeLICM] Regression test for skipping map targets. Add optimization-remarks-missed for when mapping targets have been skipped and add regression tests for them. llvm-svn: 295953	2017-02-23 10:25:20 +00:00
Michael Kruse	deb30e8278	[DeLICM] Add regression tests for DeLICM reject cases. These tests were not included in the main DeLICM commit. These check the cases where zone analysis cannot be successful because of assumption violations. We use the LLVM optimization remark infrastructure as it seems to be the best fit for this kind of message. I tried to make use of the OptimizationRemarkEmitter. However, it would insert additional function passes into the pass manager to get the hotness information. The pass manager would insert them between the flatten pass and delicm, causing the ScopInfo with the flattened schedule being thrown away. Differential Revision: https://reviews.llvm.org/D30253 llvm-svn: 295846	2017-02-22 15:14:08 +00:00
Michael Kruse	8474470500	[DeLICM] Fix wrong comment. NFC. Correct a comment that claimed that a store after load was detected when the code checks a load after a store. llvm-svn: 295835	2017-02-22 14:14:40 +00:00
Michael Kruse	43ed25f1d9	[DeLICM] Print message when zone analysis is not available on -analysis. This is to distinguish the cases that analysis has failed from the case where no transformation was performed. llvm-svn: 295833	2017-02-22 13:48:35 +00:00
Michael Kruse	91cdafb86f	[DeLICM] Use opt<int>. There is no template specialization for cl::parser<unsigned long> such that parsing a cl::opt<unsigned long> command line argument will fail. Use opt<int> instead which has an associated parser. llvm-svn: 295832	2017-02-22 13:48:18 +00:00
Tobias Grosser	cc43087afc	[DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC] We only ever use the wrapped domain of AccessSchedule, so stop creating an entire union_map and then pulling the domain out. Reviewers: grosser Tags: #polly Contributed-by: Siddharth Bhat <siddu.druid@gmail.com> Differential Revision: https://reviews.llvm.org/D30179 llvm-svn: 295726	2017-02-21 15:38:31 +00:00
Michael Kruse	9e52c39f0a	[DeLICM] Map values hoisted by LICM back to the array. Implement the -polly-delicm pass. The pass intends to undo the effects of LoopInvariantCodeMotion (LICM) which adds additional scalar dependencies into SCoPs. DeLICM will try to map those scalars back to the array elements they were promoted from, as long as the array element is unused. This is the main patch from the DeLICM/DePRE patch series. It does not yet undo GVN PRE for which additional information about known values is needed and does not handle PHI write accesses that have no target. As such its usefulness is limited. Patches for these issues including regression tests for error situations will follow. Reviewers: grosser Differential Revision: https://reviews.llvm.org/D24716 llvm-svn: 295713	2017-02-21 10:20:54 +00:00
Michael Kruse	d9cdeb453d	[Cmake] Bump required cmake version to 3.4.3. This is currently the minimum required version by LLVM. Since LLVM is needed to build Polly, we also require at least that version. Suggested-by: Philip Pfaffe <philip.pfaffe@gmail.com> llvm-svn: 295672	2017-02-20 17:06:31 +00:00
Michael Kruse	5ab24fdb73	[Cmake] Install the isl headers into the install tree. isl headers are currently missing in a Polly installation. Because the Polly headers depend on those, code can't be compiled against an installed Polly. This patch installs the isl headers. I left a TODO, as optionally it should be possible to use a system version of isl instead of the one shipped with Polly. When compiling, clients of the installation need to add -I${PREFIX}/include/polly/ to their include path right now, because there currently is no way to export this path automatically. Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com> Differential Revision: https://reviews.llvm.org/D29931 llvm-svn: 295671	2017-02-20 16:57:14 +00:00
Tobias Grosser	079d511891	[ScopInfo] Count read-only arrays when computing complexity of alias check Instead of counting the number of read-only accesses, we now count the number of distinct read-only array references when checking if a run-time alias check may be too complex. The run-time alias check is quadratic in the number of base pointers, not the number of accesses. Before this change we accidentally skipped SPEC's lbm test case. llvm-svn: 295567	2017-02-18 20:51:29 +00:00
Tobias Grosser	28492b85e2	[DependenceInfo] Pull out statement [NFC] This simplifies the code slightly. llvm-svn: 295551	2017-02-18 16:41:28 +00:00
Tobias Grosser	8ee46985d2	[Dependences] Compute reduction dependences on schedule tree [NFC] This change gets rid of the need for zero padding, makes the reduction computation code more similar to the normal dependence computation, and also better documents what we do at the moment. Making the dependence computation for reductions a little bit easier to understand will hopefully help us to further reduce code duplication. This reduces the time spent only in the reduction dependence pass from 260ms to 150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over 40% in dependence computation time. This change was inspired by discussions with Michael Kruse, Utpal Bora, Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further cleanups of the reduction code. llvm-svn: 295550	2017-02-18 16:39:04 +00:00
Tobias Grosser	41f0d81b31	[test] Add reduction sequence test case [NFC] This test case is a mini performance test case that shows the time needed for a couple of simple reductions. It takes today about 325ms on my machine to run this test case through 'opt' with scop construction and reduction detection. It can be used as mini-proxy for further tuning of the reduction code. Generally we do not commit performance test cases, but as this is very small and also very fast it seems OK to keep it in the lit test suite. This test case will also help to verify that future changes to the reduction code will not affect the ordering of the reduction sets and will consequently not cause spurious performance changes that only result from reordering of dependences in the reduction set. llvm-svn: 295549	2017-02-18 16:38:58 +00:00
Tobias Grosser	2461021150	Drop leftover debug statement llvm-svn: 295444	2017-02-17 13:39:45 +00:00
Tobias Grosser	cd01a363d6	[ScopInfo] Add statistics to count loops after scop modeling llvm-svn: 295431	2017-02-17 08:12:36 +00:00
Tobias Grosser	65ce9362b8	[ScopDetection] Compute the maximal loop depth correctly Before this change, we obtained loop depth numbers that were deeper than the actual loop depth. llvm-svn: 295430	2017-02-17 08:08:54 +00:00
Tobias Grosser	72745c2ef5	Updated isl to isl-0.18-254-g6bc184d This update includes a couple more coalescing changes as well as a large number of isl-internal code cleanups (dead assignments, ...). llvm-svn: 295419	2017-02-17 05:11:16 +00:00
Tobias Grosser	ca2cfd0bd8	[ScopInfo] Do not try to fold array dimensions of size zero Trying to fold such dimensions will result in a division by zero, which crashes the compiler. As such arrays are likely to invalidate the scop anyhow (but are not illegal in LLVM-IR), there is no point in trying to optimize the array layout. Hence, we just avoid the folding of constant dimensions of size zero. llvm-svn: 295415	2017-02-17 04:48:52 +00:00
Tobias Grosser	90411a967b	[ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC] There is only a single disjunction. However, we bound the number of 'disjuncts' in this disjunction. Name the variable accordingly. llvm-svn: 295362	2017-02-16 19:11:33 +00:00
Tobias Grosser	76ec194951	[tests] Fix some misspellings [NFC] llvm-svn: 295361	2017-02-16 19:11:29 +00:00
Tobias Grosser	c8a8276710	[ScopInfo] Bound the number of disjuncts in context Before this change, wrapping range metadata resulted in exponential growth of the context, which made context construction of large scops very slow. Instead, we now do not model the range information precisely once the number of disjuncts in the context has reached a certain limit. llvm-svn: 295360	2017-02-16 19:11:25 +00:00

3241 Commits