As Polly got a lot faster after the small-integer-optimization imath
patch, we now increase the compute out to optimize larger kernels. This
should also expose additional slow-downs for us to address.
In LNT this gives us a 3.4x speedup on 3mm, at a cost of a 2x increase in
compile time (now 0.77s). reg_detect, oorafft and adi also show some compile
time increases. This compile time cost is divided between more time in isl and
more time in LLVM's backends due to increased code size (versioning and tiling).
llvm-svn: 240840
In case we have modulo operations in the access function (supported since
r240518), the assumptions generated to ensure array accesses remain within
bounds can contain existentially quantified dimensions which results in more
complex and more difficult to handle integer sets. As a result LNT's linpack
benchmark started to fail due to excessive compile time.
We now just drop the existentially quantified dimensions. This should be
generally save, but may result in less precise assumptions which may
consequently make us fall back to the original (unoptimized) code more often. In
practice, these cases probably do not appear to often.
I had difficulties to extract a good test case, but fortunately our LNT bots
cover this one well.
llvm-svn: 240775
This removes old code that has been disabled since several weeks and was hidden
behind the flags -disable-polly-intra-scop-scalar-to-array=false and
-polly-model-phi-nodes=false. Earlier, Polly used to translate scalars and
PHI nodes to single element arrays, as this avoided the need for their special
handling in Polly. With Johannes' patches adding native support for such scalar
references to Polly, this code is not needed any more. After this commit both
-polly-prepare and -polly-independent are now mostly no-ops. Only a couple of
simple transformations still remain, but they are scheduled for removal too.
Thanks again to Johannes Doerfert for his nice work in making all this code
obsolete.
llvm-svn: 240766
Summary:
With small integer optimization (short: sio) enabled, ISL uses 32 bit
integers for its arithmetic and only falls back to a big integer library
(in the case of Polly: IMath) if an operation's result is too large.
This gives a massive performance boost for most application using ISL.
For instance, experiments with ppcg (polyhedral source-to-source
compiler) show speed-ups of 5.8 (compared to plain IMath), respectively
2.7 (compared to GMP).
In Polly, a smaller fraction of the total compile time is taken by ISL,
but the speed-ups are still very significant. The buildbots measure
compilation speed-up up to 1.8 (oourafft, floyd-warshall, symm). All
Polybench benchmarks compile in at least 9% less time, and about 20%
less on average.
Detailed Polybench compile time results (median of 10):
correlation -25.51%
covariance -24.82%
2mm -26.64%
3mm -28.69%
atax -13.70%
bicg -10.78%
cholesky -40.67%
doitgen -11.60%
gemm -11.54%
gemver -10.63%
gesummv -11.54%
mvt -9.43%
symm -41.25%
syr2k -14.71%
syrk -14.52%
trisolv -17.65%
trmm -9.78%
durbin -19.32%
dynprog -9.09%
gramschmidt -15.38%
lu -21.77%
floyd-warshall -42.71%
reg_detect -41.17%
adi -36.69%
fdtd-2d -32.61%
fdtd-apml -21.90%
jacobi-1d-imper -9.41%
jacobi-2d-imper -27.65%
seidel-2d -31.00%
Reviewers: grosser
Reviewed By: grosser
Subscribers: Meinersbur, llvm-commits, pollydev
Projects: #polly
Differential Revision: http://reviews.llvm.org/D10506
llvm-svn: 240689
There were two issues:
* ISL's configure generates include/isl/stdint.h, not isl/stdint.h as
assumed. This is also changed in the CMake build.
* Need to pass --with-int=imath to ISL's configure; the default is gmp.
Polly's configure has been regenerated due to changing configure.ac
llvm-svn: 240657
Remainder operations with constant divisor can be modeled as quasi-affine
expression. This patch adds support for detecting and modeling them. We also
add a test that ensures they are correctly code generated.
This patch was extracted from a larger patch contributed by Johannes Doerfert
in http://reviews.llvm.org/D5293
llvm-svn: 240518
This makes the test cases nonaffine even if Polly some days gains support for
the srem instruction, an instruction which is currently not modeled but which
can clearly be modeled statically. A call to a function without definition
will always remain non-affine, as there is just insufficient static information
for it to be modeled more precisely.
llvm-svn: 240458
ISL with small integer optimization requires C99 to compile. gcc < 5.0
still uses C89 as default, so we need to enable the options to compile
in C99 mode.
This patch is preparing the actual activation of small integer
optimization.
Differential version: http://reviews.llvm.org/D10610
Reviewers: grosser
llvm-svn: 240322
ISL's ./configure examines the system for the stdint.h to include and
creates a header file that points to it. On C99-compatible system
#include <stdint.h>
is always valid such there no need for system introspection. This should
unbreak the build bots.
llvm-svn: 240315
The 'make dist' archive is not dependent on ./configure output and
contains a GIT_HEAD_ID file that identifies the version of ISL used.
None of the files added or removed are used part of Polly's build
process (except of GIT_HEAD_ID since the previous revision r240301). No
functional change intended.
llvm-svn: 240306
Currently the Polly repository contains the ISL sources with bogus
isl_config.h and gitversion.h. This is problematic. In this state a
macro
#define __attribute__(x)
becomes active in the source, leading to various problems e.g. when
included before system header files. This patch will instead generate
the two files specific to the host system at configure-time.
For CMake, we replicate the tests that ISL's configure performs using
try_compile(). In autotools build, we just invoke ISL's configure to
generate the two files. This consequently required regenerating
autoconf/configure.
'make dist' distributions of ISL contain a file GIT_HEAD_ID which
contains the version the distribution is derived from. The repository
files themselves do not contain such a hint. In a later commit we will
replace the isl directory by the contents of such a .tar.gz. It does
not contain the files imdrover.c iprime.c pi.c and rsamath.c currently
compiled into Polly, but not used and therefore are removed by this
patch.
In the long term we plan to generate a dedicated library for ISL instead
of adding its files to Polly.
This also does not yet include the switch to small-integer optimized ISL
nor enabling C99 mode required for the former. Those will come as well
in separate patches.
Differential version: http://reviews.llvm.org/D10603
Reviewers: grosser
llvm-svn: 240301
This was meant to committed in r240027, but was left behind because
svn, in contrast to git, only commits the changes in the directory you
are currently in.
llvm-svn: 240034
This version adds small integer optimization, but is not active by
default. It will be enabled in a later commit.
The schedule-fuse=min/max option has been replaced by the
serialize-sccs option. Adapting Polly was necessary, but retaining the
name polly-opt-fusion=min/max.
Differential Revision: http://reviews.llvm.org/D10505
Reviewers: grosser
llvm-svn: 240027
LLVM's instcombine already translates power-of-two sdivs that are known to be
exact to fast ashr instructions. Hence, there is no need to add this logic
ourselves.
Pointed-out-by: Johannes Doerfert
llvm-svn: 239025
We now verify that memory access functions imported via JSON are indeed defined
for the full iteration domain. Before this change we accidentally imported
memory mappings such as i -> i / 127, which only defined a mapped for values of
i that are evenly divisible by 127, but which did not define any mapping for the
remaining values, with the result that isl just generated an access expression
that had undefined behavior for all the unmapped values.
In the incorrect test cases, we now either use floor(i/127) or we use p/127 and
provide the information that p is indeed a multiple of 127.
llvm-svn: 239024
floord(a,b) === a ashr log_2 (b) holds for positive and negative a's, but
shifting only makes sense for positive values of b. The previous patch did
not consider this as isl currently always produces postive b's. To avoid future
surprises, we check that b is positive and only then apply the optimization.
We also now correctly check the return value of the dyn-cast.
No additional test case, as isl currently does not produce negative
denominators.
Reported-by: David Majnemer <david.majnemer@gmail.com>
llvm-svn: 238927
Running indvar before Polly is useful as this eliminates zexts as they commonly
appear when a 32 bit induction variable (type int) was used on a 64 bit system.
These zexts confuse our delinearization and prevent for example the successful
delinearization of the nussinov kernel in polybench-c-4.1.
This fixes http://llvm.org/PR23426
Suggested-by: Xing Su <xsu.llvm@outlook.com>
llvm-svn: 238643
isl marks known non-negative numerators in modulo (and soon also division)
operations. We now exploit this by generating unsigned operations. This is
beneficial as unsigned operations with power-of-two denominators will be
translated by isl to fast bitshift or bitwise and operations.
llvm-svn: 238577
David Blaikie:
"find returns an iterator by value, so it's just added complexity/strangeness to
then use reference lifetime extension to give it the same semantics as if you'd
used a value type instead of a reference type."
llvm-svn: 238294
David Blaike suggested this as an alternative to the use of owningptr(s) for our
memory management, as value semantics allow to avoid the additional interface
complexity caused by owningptr while still providing similar memory consistency
guarantees. We could also have used a std::vector, but the use of std::vector
would yield possibly changing pointers which currently causes problems as for
example the memory accesses carry pointers to their parent statements. Such
pointers should not change.
Reviewer: jblaikie, jdoerfert
Differential Revision: http://reviews.llvm.org/D10041
llvm-svn: 238290
While looking through the test cases I realized we did not have a CHECK line
for a duplicate memory access which we may want to eliminate later. To ensure
we do not have (or later introduce) unnecessary memory accesses, we now tighten
the test cases to look for such a pattern (and add the CHECK: line that shows
the redundant memory access).
llvm-svn: 238227
The feature itself has been committed by Johannes in r238070. As this is the
way forward, we now enable it to ensure we get test coverage.
Thank you Johannes for this nice work!
llvm-svn: 238088
This ensures we pass all tests independently of how we set the options
-disable-polly-intra-scop-scalar-to-array and -polly-model-phi-nodes.
(At least if we enable both or disable both. Enabling them individually makes
little sense, as they will hopefully disappear soon anyhow).
llvm-svn: 238087
To reduce compile time and to allow more and better quality SCoPs in
the long run we introduced scalar dependences and PHI-modeling. This
patch will now allow us to generate code if one or both of those
options are set. While the principle of demoting scalars as well as
PHIs to memory in order to communicate their value stays the same,
this allows to delay the demotion till the very end (the actual code
generation). Consequently:
- We __almost__ do not modify the code if we do not generate code
for an optimized SCoP in the end. Thus, the early exit as well as
the unprofitable option will now actually preven us from
introducing regressions in case we will probably not get better
code.
- Polly can be used as a "pure" analyzer tool as long as the code
generator is set to none.
- The original SCoP is almost not touched when the optimized version
is placed next to it. Runtime regressions if the runtime checks
chooses the original are not to be expected and later
optimizations do not need to revert the demotion for that part.
- We will generate direct accesses to the demoted values, thus there
are no "trivial GEPs" that select the first element of a scalar we
demoted and treated as an array.
Differential Revision: http://reviews.llvm.org/D7513
llvm-svn: 238070