llvm-project

Commit Graph

Author	SHA1	Message	Date
Tobias Grosser	4b5e24df9a	cmake: avoid "zero-length gnu_printf format string" warning in gcc 6.1.1 Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk> llvm-svn: 284302	2016-10-15 05:08:12 +00:00
Michael Kruse	fa53c86dc1	[ScopInfo/CodeGen] ExitPHI reads are implicit. Under some conditions MK_Value read accessed where converted to MK_ExitPHI read accessed. This is unexpected because MK_ExitPHI read accesses are implicit after the scop execution. This behaviour was introduced in r265261, which fixed a failed assertion/crash in CodeGen. Instead, we fix this failure in CodeGen itself. createExitPHINodeMerges(), despite its name, also handles accesses of kind MK_Value, only to skip them because they access values that are usually not PHI nodes in the SCoP region's exit block. Except in the situation observed in r265261. Do not convert value accessed to ExitPHI accesses and do not handle value accesses like ExitPHI accessed in CodeGen anymore. llvm-svn: 284023	2016-10-12 16:31:09 +00:00
Michael Kruse	c9edc2ee8d	[DepInfo] Print -debug output outside of max-operations scope. ISL tries to simplify the polyhedral operations before printing its objects. This increases the operations counter and therefore can contribute to hitting the operations limit. Therefore the result could be different when -debug output is enabled, making debugging harder. llvm-svn: 283745	2016-10-10 11:45:59 +00:00
Michael Kruse	8bfba1ff46	[Support/DepInfo] Introduce IslMaxOperationsGuard and make DepInfo use it. NFC. IslMaxOperationsGuard defines a scope where ISL may abort operations because if it takes too many operations. Replace the call to the raw ISL interface by a use of the guard. IslMaxOperationsGuard provides a uniform way to define a maximal computation time for a code region in C++ using RAII. llvm-svn: 283744	2016-10-10 11:45:54 +00:00
Tobias Grosser	b270288752	Fix formatting after recent cl:: changes This fixes 'make check-polly' llvm-svn: 283693	2016-10-09 08:31:35 +00:00
Mehdi Amini	732afdd09a	Turn cl::values() (for enum) from a vararg function to using C++ variadic template The core of the change is supposed to be NFC, however it also fixes what I believe was an undefined behavior when calling: va_start(ValueArgs, Desc); with Desc being a StringRef. Differential Revision: https://reviews.llvm.org/D25342 llvm-svn: 283671	2016-10-08 19:41:06 +00:00
Hongbin Zheng	5860aef675	Define PATH_MAX on windows Differential Revision: https://reviews.llvm.org/D25372 llvm-svn: 283600	2016-10-07 20:58:20 +00:00
Michael Kruse	de42b43de0	[cmake] Unify disabling upstream project warnings. Handle MSVC, ISL and PPCG in one place. The only functional change is that warnings are also disabled for MSVC compiling PPCG (Which currently fails anyway). llvm-svn: 283547	2016-10-07 12:38:32 +00:00
Michael Kruse	4b5f6af2dc	[cmake] Move isl_test artifacts to Polly folder. Folders in Visual Studio solutions help organize the build artifacts from all LLVM projects. There is a folder to keep Polly-built files in. llvm-svn: 283546	2016-10-07 12:38:24 +00:00
Tobias Grosser	e84ee850d1	Build and run isl_test as part of check-polly Running isl tests is important to gain confidence that the isl build we created works as expected. Besides the actual isl tests, there are also isl AST generation tests shipped with isl. This change only adds support for the isl unit tests. AST generation test support is left for a later commit. There is a choice to run tests directly through the build system or in the context of lit. We choose to run tests as part of lit to as this allows us to easily set environment variables, print output only on error and generally run the tests directly from the lit command. Reviewers: brad.king, Meinersbur Subscribers: modocache, brad.king, pollydev, beanz, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D25155 llvm-svn: 283245	2016-10-04 19:48:40 +00:00
Michael Kruse	6ab4476835	[ScopInfo] Add -polly-unprofitable-scalar-accs option. With this option one can disable the heuristic that assumes that statements with a scalar write access cannot be profitably optimized. Such a statement instances necessarily have WAW-dependences to itself. With DeLICM scalar accesses can be changed to array accesses, which can avoid these WAW-dependence. llvm-svn: 283233	2016-10-04 17:33:39 +00:00
Michael Kruse	ca7cbcca37	[ScopInfo] Scalar access do not have indirect base pointers. ScopArrayInfo used to determine base pointer origins by looking up whether the base pointer is a load. The "base pointer" for scalar accesses is the llvm::Value being accessed. This is only a symbolic base pointer, it represents the alloca variable (.s2a or .phiops) generated for it at code generation. This patch disables determining base pointer origin for scalars. A test case where this caused a crash will be added in the next commit. In that test SAI tried to get the origin base pointer that was only declared later, therefore not existing. This is probably only possible for scalars used in PHINode incoming blocks. llvm-svn: 283232	2016-10-04 17:33:34 +00:00
Michael Kruse	9116899615	[ScopInfo] Make simplifySCoP() public. NFC. This function may need to be called after the scop construction. The upcoming DeLICM will use this to cleanup statement that all write accesses have been removed from. llvm-svn: 283221	2016-10-04 14:14:33 +00:00
Tobias Grosser	5652c9c830	isl: update to isl-0.17.1-233-gc911e6a llvm-svn: 283049	2016-10-01 19:46:51 +00:00
Michael Kruse	4b0c5aea78	[CodeGen] Add assertion for indirect array index expression generation. NFC. Currently Polly cannot generate code for index expressions if the base pointer is computed within the scop. The base pointer must be generated as well, but there is no code that triggers that. Add an assertion to detect when this would occur and miscompile. The IR verifier should catch it as well. llvm-svn: 282893	2016-09-30 18:29:37 +00:00
Michael Kruse	f6f795e9ae	[Support] Complete ISL annotations to IslPtr<>. NFC. Add missing __isl_(give/take/keep) annotations to IslPtr<> and NonowningIslPtr<> methods. Because IslPtr's constructor's annotation would depend on the TakeOwnership parameter, the parameter has been removed. Caller must copy the object themselves if the do not want to take ownership. llvm-svn: 282883	2016-09-30 17:47:39 +00:00
Michael Kruse	51f514d853	[Support] Compile fix for gcc. NFC. gcc 5.4 insists on template specialization to be in a namespace polly { ... } block, instead of being prefixed with 'polly::'. Error message: root/src/llvm/tools/polly/lib/Support/GICHelper.cpp:203:54: error: specialization of ‘template<class T> void polly::IslPtr<T>::dump() const’ in different namespace [-fpermissive] template <> void polly::IslPtr<isl_##TYPE>::dump() const { \ ^ msvc14 and clang 3.8 did not complain. llvm-svn: 282874	2016-09-30 16:47:43 +00:00
Michael Kruse	55519dad62	[Support] Add (Nonowning-)IslPtr::dump(). NFC. The dump() methods can be called from a debugger instead of e.g. isl_*_dump(Var.Obj) where Var is a variable of type IslPtr/NonowningIslPtr. To ensure that the existence of the function pointers do not depdend on whether the methods are used somwhere, they are declared with external linkage. llvm-svn: 282870	2016-09-30 16:10:19 +00:00
Michael Kruse	32312d0294	[Support] Call isl__free() only on non-null pointers. NFC. Add a non-NULL check before calling the free function into functions that are supposed to be inlined. First, this is a form of partial inlining of the free function, namely the nullptr test that free has to do. Secondly, and more importantly, it allows the compiler to remove the call to isl__free() when it knows that the object is nullptr, for instance because the last call is a take(). "Consuming" the last use of an ISL object using take() (instead of copy()) is a common pattern. llvm-svn: 282864	2016-09-30 15:29:43 +00:00
Michael Kruse	888ab55140	[CodeGen] Change 'Scalar' to 'Array' in method names. NFC. generateScalarLoad() and generateScalarStore() are used for explicit (MK_Array) memory accesses, therefore the method names were misleading. The names also were similar to generateScalarLoads() and generateScalarStores() (plural forms) which indeed handle scalar accesses. Presumbly, they were originally named to contrast VectorBlockGenerator::generateLoad(). Rename the two methods to generateArrayLoad(), respectively generateArrayStore(). llvm-svn: 282861	2016-09-30 14:34:05 +00:00
Michael Kruse	77394f1394	[CodeGen] Add assertion for partial scalar accesses. NFC. The code generator always adds unconditional LoadInst and StoreInst, hence the MemoryAccess must be defined over all statement instances. llvm-svn: 282853	2016-09-30 14:01:46 +00:00
Tobias Grosser	b110296011	www: Add Loopy publication llvm-svn: 282747	2016-09-29 18:17:30 +00:00
Tobias Grosser	ae91bc7353	www: add new code coverage link to Polly website llvm-svn: 282351	2016-09-25 08:03:38 +00:00
Tobias Grosser	349d1c3368	[ScopDetection] Remove redundant checks for endless loops Summary: Both `canUseISLTripCount()` and `addOverApproximatedRegion()` contained checks to reject endless loops which are now removed and replaced by a single check in `isValidLoop()`. For reporting such loops the `ReportLoopOverlapWithNonAffineSubRegion` is renamed to `ReportLoopHasNoExit`. The test case `ReportLoopOverlapWithNonAffineSubRegion.ll` is adapted and renamed as well. The schedule generation in `buildSchedule()` is based on the following assumption: Given some block B that is contained in a loop L and a SESE region R, we assume that L is contained in R or the other way around. However, this assumption is broken in the presence of endless loops that are nested inside other loops. Therefore, in order to prevent erroneous behavior in `buildSchedule()`, r265280 introduced a corresponding check in `canUseISLTripCount()` to reject endless loops. Unfortunately, it was possible to bypass this check with -polly-allow-nonaffine-loops which was fixed by adding another check to reject endless loops in `allowOverApproximatedRegion()` in r273905. Hence there existed two separate locations that handled this case. Thank you Johannes Doerfert for helping to provide the above background information. Reviewers: Meinersbur, grosser Subscribers: _jdoerfert, pollydev Differential Revision: https://reviews.llvm.org/D24560 Contributed-by: Matthias Reisinger <d412vv1n@gmail.com> llvm-svn: 281987	2016-09-20 17:05:22 +00:00
Tobias Grosser	122d6d74f6	Fix spelling in CMakeLists llvm-svn: 281897	2016-09-19 10:55:31 +00:00
Tobias Grosser	05ee64e67a	GPGPU: add missing REQUIRES line to test case llvm-svn: 281850	2016-09-18 08:57:38 +00:00
Tobias Grosser	bc653f2031	GPGPU: Do not run mostly sequential kernels in GPU In case sequential kernels are found deeper in the loop tree than any parallel kernel, the overall scop is probably mostly sequential. Hence, run it on the CPU. llvm-svn: 281849	2016-09-18 08:31:09 +00:00
Tobias Grosser	82f2af3508	GPGPU: Dynamically ensure 'sufficient compute' Offloading to a GPU is only beneficial if there is a sufficient amount of compute that can be accelerated. Many kernels just have a very small number of dynamic compute, which means GPU acceleration is not beneficial. We compute at run-time an approximation of how many dynamic instructions will be executed and fall back to CPU code in case this number is not sufficiently large. To keep the run-time checking code simple, we over-approximate the number of instructions executed in each statement by computing the volume of the rectangular hull of its iteration space. llvm-svn: 281848	2016-09-18 06:50:35 +00:00
Tobias Grosser	cfdee6582b	GPGPU: Make test cases independent of register numbering [NFC] llvm-svn: 281847	2016-09-18 06:50:28 +00:00
Tobias Grosser	51dfc27589	GPGPU: Store back non-read-only scalars We may generate GPU kernels that store into scalars in case we run some sequential code on the GPU because the remaining data is expected to already be on the GPU. For these kernels it is important to not keep the scalar values in thread-local registers, but to store them back to the corresponding device memory objects that backs them up. We currently only store scalars back at the end of a kernel. This is only correct if precisely one thread is executed. In case more than one thread may be run, we currently invalidate the scop. To support such cases correctly, we would need to always load and store back from a corresponding global memory slot instead of a thread-local alloca slot. llvm-svn: 281838	2016-09-17 19:22:31 +00:00
Tobias Grosser	fe74a7a1f5	GPGPU: Detect read-only scalar arrays ... and pass these by value rather than by reference. llvm-svn: 281837	2016-09-17 19:22:18 +00:00
Tobias Grosser	8f86a47461	Update CFGPrinter -> CFGPrinterLegacyPass .. to match recent changes in LLVM that broke the Polly compilation. llvm-svn: 281705	2016-09-16 05:48:09 +00:00
Tobias Grosser	aaabbbf886	GPGPU: Do not assume arrays start at 0 Our alias checks precisely check that the minimal and maximal accessed elements do not overlap in a kernel. Hence, we must ensure that our host <-> device transfers do not touch additional memory locations that are not covered in the alias check. To ensure this, we make sure that the data we copy for a given array is only the data from the smallest element accessed to the largest element accessed. We also adjust the size of the array according to the offset at which the array is actually accessed. An interesting result of this is: In case array are accessed with negative subscripts ,e.g., A[-100], we automatically allocate and transfer _more_ data to cover the full array. This is important as such code indeed exists in the wild. llvm-svn: 281611	2016-09-15 14:05:58 +00:00
Roman Gareev	b3224adfb6	Perform copying to created arrays according to the packing transformation This is the fourth patch to apply the BLIS matmul optimization pattern on matmul kernels (http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf). BLIS implements gemm as three nested loops around a macro-kernel, plus two packing routines. The macro-kernel is implemented in terms of two additional loops around a micro-kernel. The micro-kernel is a loop around a rank-1 (i.e., outer product) update. In this change we perform copying to created arrays, which is the last step to implement the packing transformation. Reviewed-by: Tobias Grosser <tobias@grosser.es> Differential Revision: https://reviews.llvm.org/D23260 llvm-svn: 281441	2016-09-14 06:26:09 +00:00
Tobias Grosser	e8c69bbabd	cmake: PollyPPCG depends on PollyISL This line makes BUILD_SHARED_LIBS=ON work for Polly-ACC. Without it, ld complains about missing isl symbols when constructing the shared library. llvm-svn: 281396	2016-09-13 21:09:35 +00:00
Tobias Grosser	0a893f7df4	GPGPU: Use const_cast to avoid compiler warning [NFC] llvm-svn: 281333	2016-09-13 13:22:27 +00:00
Michael Kruse	19c9d99f45	Use value directly instead of reference. NFC. The alias to the array element is read-only and a primitive type (pointer), therefore use the value directly instead of a reference to it. llvm-svn: 281311	2016-09-13 09:56:05 +00:00
Tobias Grosser	a82c4b5df8	GPGPU: Allow region statements llvm-svn: 281305	2016-09-13 08:42:10 +00:00
Tobias Grosser	b79f4d3970	GPGPU: Extend types when array sizes have smaller types This prevents a compiler crash. llvm-svn: 281303	2016-09-13 08:02:14 +00:00
Tobias Grosser	b51d507c74	Adapt test case to recent change in Global Variable Definition llvm-svn: 281295	2016-09-13 05:19:26 +00:00
Michael Kruse	e5e752a28b	Remove -fvisibility=hidden and FORCE_STATIC. The flag -fvisibility=hidden flag was used for the integrated Integer Set Library (and PPCG) to keep their definitions local to Polly. The motivation was the be loaded into a DragonEgg-powered GCC, where GCC might itself use ISL for its Graphite extension. The symbols of Polly's ISL and GCC's ISL would clash. The DragonEgg project is not actively developed anymore, but Polly's unittests need to call ISL functions to set up a testing environment. Unfortunately, the -fvisibility=hidden flag means that the ISL symbols are not available to the gtest executable as it resides outside of libPolly when linked dynamically. Currently, CMake links a second copy of ISL into the unittests which leads to subtle bugs. What got observed is that two isl_ids for isl_id_none exist, one for each library instance. Because isl_id's are compared by address, isl_id_none could happen to be different from isl_id_none, depending on which library instance set the address and does the comparison. Also remove the FORCE_STATIC flag which was introduced to keep the ISL symbols visible inside the same libPolly shared object, even when build with BUILD_SHARED_LIBS. Differential Revision: https://reviews.llvm.org/D24460 llvm-svn: 281242	2016-09-12 18:25:00 +00:00
Roman Gareev	f5aff70405	Store the size of the outermost dimension in case of newly created arrays that require memory allocation. We do not need the size of the outermost dimension in most cases, but if we allocate memory for newly created arrays, that size is needed. Reviewed-by: Michael Kruse <llvm@meinersbur.de> Differential Revision: https://reviews.llvm.org/D23991 llvm-svn: 281234	2016-09-12 17:08:31 +00:00
Tobias Grosser	5857b701a3	GPGPU: Bail out gracefully in case of invalid IR Instead of aborting, we now bail out gracefully in case the kernel IR we generate is invalid. This can currently happen in case the SCoP stores pointer values, which we model as arrays, as data values into other arrays. In this case, the original pointer value is not available on the device and can consequently not be stored. As detecting this ahead of time is not so easy, we detect these situations after the invalid IR has been generated and bail out. llvm-svn: 281193	2016-09-12 06:06:31 +00:00
Tobias Grosser	0bf4cc6499	Add missing 'REQUIRES' line llvm-svn: 281166	2016-09-11 13:42:42 +00:00
Tobias Grosser	02293ed755	GPGPU: Do not fail in case of arrays never accessed If these arrays have never been accessed we failed to derive an upper bound of the accesses and consequently a size for the outermost dimension. We now explicitly check for empty access sets and then just use zero as size for the outermost dimension. llvm-svn: 281165	2016-09-11 13:30:12 +00:00
Tobias Grosser	89fda2bde2	GPURuntime: ensure compilation with C99 Otherwise, older compiler will error out on some of the C99 features we use. llvm-svn: 281159	2016-09-11 07:32:50 +00:00
Tobias Grosser	5aea5653b3	FlattenAlgo: Ensure we _really_ obtain a param space This resolves "isl_space.c:1775: not a parameter space" errors I have seen on two systems. llvm-svn: 281052	2016-09-09 16:11:26 +00:00
Tobias Grosser	a6987a4ddd	Add namespace specifier before nullptr_t This fixes the following compile time errors: error: unknown type name 'nullptr_t'; did you mean 'std::nullptr_t' llvm-svn: 281039	2016-09-09 12:31:38 +00:00
Tobias Grosser	a3afe44d6c	IslNodeBuilder: Add missing __isl_take annotation llvm-svn: 281034	2016-09-09 11:16:50 +00:00
Michael Kruse	7886bd7ca5	Add -polly-flatten-schedule pass. The -polly-flatten-schedule pass reduces the number of scattering dimensions in its isl_union_map form to make them easier to understand. It is not meant to be used in production, only for debugging and regression tests. To illustrate, how it can make sets simpler, here is a lifetime set used computed by the porposed DeLICM pass without flattening: { Stmt_reduction_for[0, 4] -> [0, 2, o2, o3] : o2 < 0; Stmt_reduction_for[0, 4] -> [0, 1, o2, o3] : o2 >= 5; Stmt_reduction_for[0, 4] -> [0, 1, 4, o3] : o3 > 0; Stmt_reduction_for[0, i1] -> [0, 1, i1, 1] : 0 <= i1 <= 3; Stmt_reduction_for[0, 4] -> [0, 2, 0, o3] : o3 <= 0 } And here the same lifetime for a semantically identical one-dimensional schedule: { Stmt_reduction_for[0, i1] -> [2 + 3i1] : 0 <= i1 <= 4 } Differential Revision: https://reviews.llvm.org/D24310 llvm-svn: 280948	2016-09-08 15:02:36 +00:00

1 2 3 4 5 ...

2709 Commits