Commit Graph

2861 Commits

Author SHA1 Message Date
Tobias Grosser 91f02e192a www: Add pollylabs news to navigation
llvm-svn: 291385
2017-01-08 08:30:11 +00:00
Tobias Grosser dd135157ff www: Add news for HiPEAC 2017
llvm-svn: 291384
2017-01-08 08:25:41 +00:00
Tobias Grosser cdbe5c9d6c Fix some typos in comments
llvm-svn: 291247
2017-01-06 17:30:34 +00:00
Tobias Grosser 94e5371dde Update to isl-0.18-43-g0b4256f
Even more isl coalesce changes.

llvm-svn: 290783
2016-12-31 07:46:11 +00:00
Tobias Grosser ba3ea97689 Update to isl-0.18-28-gccb9f33
Another set of isl coalesce changes.

llvm-svn: 290681
2016-12-28 19:35:49 +00:00
Tobias Grosser 600941351e Update to isl-0.18-17-g2844ebf
This update improves isl's ability to coalesce different convex sets/maps,
especially when the contain existentially quantified variables.

llvm-svn: 290538
2016-12-26 12:11:40 +00:00
Roman Gareev 1c2927b209 Specify the default values of the cache parameters
If the parameters of the target cache (i.e., cache level sizes, cache level
associativities) are not specified or have wrong values, we use ones for
parameters of the macro-kernel and do not perform data-layout optimizations of
the matrix multiplication. In this patch we specify the default values of the
cache parameters to be able to apply the pattern matching optimizations even in
this case. Since there is no typical values of this parameters, we use the
parameters of Intel Core i7-3820 SandyBridge that also help to attain the
high-performance on IBM POWER System S822 and IBM Power 730 Express server.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D28090

llvm-svn: 290518
2016-12-25 16:32:28 +00:00
Tobias Grosser 0791d5f5aa ScheduleOptimizer: Fix spelling of option '-polly-target-throughput-vector-fma'
througput -> throughput

llvm-svn: 290418
2016-12-23 07:33:39 +00:00
Tobias Grosser ccae1ee4df Update isl to isl-0.18-9-gd4734f3
llvm-svn: 290389
2016-12-22 23:08:57 +00:00
Roman Gareev be5299af0b Change the determination of parameters of macro-kernel
Typically processor architectures do not include an L3 cache, which means that
Nc, the parameter of the micro-kernel, is, for all practical purposes,
redundant ([1]). However, its small values can cause the redundant packing of
the same elements of the matrix A, the first operand of the matrix
multiplication. At the same time, big values of the parameter Nc can cause
segmentation faults in case the available stack is exceeded.

This patch adds an option to specify the parameter Nc as a multiple of
the parameter of the micro-kernel Nr.

In case of Intel Core i7-3820 SandyBridge and the following options,

clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
-march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
-mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
-polly-target-latency-vector-fma=8

it helps to improve the performance from 11.303 GFlops/sec (39,247% of
theoretical peak) to 17.896 GFlops/sec (62,14% of theoretical peak).

Refs.:

[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D28019

llvm-svn: 290256
2016-12-21 12:51:12 +00:00
Roman Gareev bd5c6039c6 Align newly created arrays to the first level cache line boundary
Aligning data to cache lines boundaries helps to avoid overheads related to
an access to it ([1]). This patch aligns newly created arrays and adds an
option to specify the first level cache line size. By default we use 64 bytes,
which is a typical cache-line size ([2]).

In case of Intel Core i7-3820 SandyBridge and the following options,

clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
-march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
-mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
-polly-target-latency-vector-fma=8

it helps to improve the performance from 11.303 GFlops/sec (39,247% of
theoretical peak) to 12.63 GFlops/sec (43,8542% of theoretical peak).

Refs.:

[1] - http://www.alexonlinux.com/aligned-vs-unaligned-memory-access
[2] - http://igoro.com/archive/gallery-of-processor-cache-effects/

Differential Revision: https://reviews.llvm.org/D28020

Reviewed-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 290253
2016-12-21 12:37:36 +00:00
Roman Gareev 92c446016a [Polly] Use three-dimensional arrays to store packed operands of the matrix
multiplication

Previously we had two-dimensional accesses to store packed operands of
the matrix multiplication for the sake of simplicity of the packed arrays.
However, addition of the third dimension helps to simplify the corresponding
memory access, reduce the execution time of isl operations applied to it, and
consequently reduce the compile-time of Polly. For example, in case of
Intel Core i7-3820 SandyBridge and the following options,

clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
-march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
-mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
-polly-target-latency-vector-fma=7

it helps to reduce the compile-time from about 361.456 seconds to about 0.816
seconds.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>,
             Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D27878

llvm-svn: 290251
2016-12-21 11:18:42 +00:00
Adrian Prantl abd69332e2 Fix debug info metadata for upstream change in LLVM.
llvm-svn: 290154
2016-12-20 02:09:59 +00:00
Tobias Grosser b6945e3301 Fix clang-format
llvm-svn: 290103
2016-12-19 14:06:40 +00:00
Daniel Jasper e5f3eba9c3 Fix format after recent clang-format change.
llvm-svn: 290085
2016-12-19 07:54:15 +00:00
Alexandre Isoard cbed3ce39f Add isl_multi_pw_aff to GICHelper
Add isl_multi_pw_aff* to GICHelper and add some missing isl_pw_multi_aff* handlers.

llvm-svn: 290007
2016-12-16 23:41:26 +00:00
Adrian Prantl 80d13b4545 Revert "Fix debug info metadata for upstream change in LLVM."
llvm-svn: 289983
2016-12-16 19:39:18 +00:00
Adrian Prantl e0a1bdad3f Fix debug info metadata for upstream change in LLVM.
llvm-svn: 289953
2016-12-16 16:17:24 +00:00
Roman Gareev 2606c48a1d Restrict ranges of extension maps
To prevent copy statements from accessing arrays out of bounds, ranges of their
extension maps are restricted, according to the constraints of domains.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D25655

llvm-svn: 289815
2016-12-15 12:35:59 +00:00
Roman Gareev 15db81ef71 [NFC] Fix typos in getMacroKernelParams.
llvm-svn: 289808
2016-12-15 12:00:57 +00:00
Roman Gareev 8babe1a216 The order of the loops defines the data reused in the BLIS implementation of
gemm ([1]). In particular, elements of the matrix B, the second operand of
matrix multiplication, are reused between iterations of the innermost loop.
To keep the reused data in cache, only elements of matrix A, the first operand
of matrix multiplication, should be evicted during an iteration of the
innermost loop. To provide such a cache replacement policy, elements of the
matrix A can, in particular, be loaded first and, consequently, be
least-recently-used.

In our case matrices are stored in row-major order instead of column-major
order used in the BLIS implementation ([1]). One of the ways to address it is
to accordingly change the order of the loops of the loop nest. However, it
makes elements of the matrix A to be reused in the innermost loop and,
consequently, requires to load elements of the matrix B first. Since the LLVM
vectorizer always generates loads from the matrix A before loads from the
matrix B and we can not provide it. Consequently, we only change the BLIS micro
kernel and the computation of its parameters instead. In particular, reused
elements of the matrix B are successively multiplied by specific elements of
the matrix A .

Refs.:
[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D25653

llvm-svn: 289806
2016-12-15 11:47:38 +00:00
Michael Kruse 7037fde427 Remove references to AssumptionCache. NFC.
The AssumptionCache was removed in r289756 after being replaced by the an
addtional operand list of affected values in r289755. The absence of that cache
means that we have now have to manually search for llvm.assume intrinsics as
now done by other passes (LazyValueInfo, CodeMetrics) do not take into
account an llvm::Instruction's user lists (ScalarEvolution).

llvm-svn: 289791
2016-12-15 09:25:14 +00:00
Tobias Grosser b02b6a8404 Adjust clang-format formatting to r289531
clang-format has been updated in r289531 to keep labels and values on
the same line. This change updates Polly to the new formatting style.

llvm-svn: 289533
2016-12-13 12:44:00 +00:00
Michael Kruse 79c0173f53 [ScheduleOptimizer] Fix memory leak. NFC.
llvm-svn: 289434
2016-12-12 14:51:06 +00:00
Michael Kruse 209b463240 Add unittests for foreach(Elt|Piece). NFC.
llvm-svn: 288925
2016-12-07 17:48:02 +00:00
Michael Kruse b9a683d75d Add more ISL foreachElt functions. NFC.
Add and implement foreachElt for isl_map, isl_set and isl_union_set. These are
used by an out-of-tree patch which is in process of being upstreamed.

llvm-svn: 288924
2016-12-07 17:47:57 +00:00
Michael Kruse 2ead2bfc12 Add IslPtr type traits. NFC.
Add traits for isl_id and isl_multi_aff, required by out-of-tree patches
currently in progress of upstreaming.

isl_union_pw_aff_dump has been added to ISL during one of the last ISL
updates, such that we can also enable its dump() trait.

llvm-svn: 288915
2016-12-07 16:17:59 +00:00
Michael Kruse 1b8eb4104b Update to isl-0.17.1-314-g3106e8d
This version includes an update for imath (isl-0.17.1-49-g2f1c129). It fixes
the compilation under windows, which does not know ssize_t.

In addition, isl-0.17.1-288-g0500299 changed the way isl_test finds the source
directory. It now generates a file isl_srcdir.c at configure-time, containing
the source path, to not require setting the environment variable "srcdir" at
test-time. The cmake build system had to be modified to also generate that file.

llvm-svn: 288811
2016-12-06 14:37:39 +00:00
Johannes Doerfert bda814350a Allow to disable unsigned operations (zext, icmp ugt, ...)
Unsigned operations are often useful to support but the heuristics are
not yet tuned. This options allows to disable them if necessary.

llvm-svn: 288521
2016-12-02 17:55:41 +00:00
Johannes Doerfert a94ae1aede Do not allow multiple possibly aliasing ptrs in an expression
Relational comparisons should not involve multiple potentially
  aliasing pointers. Similarly this should hold for switch conditions
  and the two conditions involved in equality comparisons (separately!).
  This is a heuristic based on the C semantics that does only allow such
  operations when the base pointers do point into the same object.
  Since this makes aliasing likely we will bail out early instead of
  producing a probably failing runtime check.

llvm-svn: 288516
2016-12-02 17:49:52 +00:00
Johannes Doerfert 2df9963fe3 Rerun mem2reg after the inliner
It did happen that after the inliner finished we end up with promotable
allocas in a function. We now run mem2reg to make sure everything is
promoted if possible.

llvm-svn: 288514
2016-12-02 17:43:57 +00:00
Tobias Grosser bedef00e2c [ScopInfo] Fold constant coefficients in array dimensions to the right
This allows us to delinearize code such as the one below, where the array
sizes are A[][2 * n] as there are n times two elements in the innermost
dimension. Alternatively, we could try to generate another dimension for the
struct in the innermost dimension, but as the struct has constant size,
recovering this dimension is easy.

   struct com {
     double Real;
     double Img;
   };

   void foo(long n, struct com A[][n]) {
     for (long i = 0; i < 100; i++)
       for (long j = 0; j < 1000; j++)
         A[i][j].Real += A[i][j].Img;
   }

   int main() {
     struct com A[100][1000];
     foo(1000, A);

llvm-svn: 288489
2016-12-02 08:10:56 +00:00
Tobias Grosser 491b799a4d [ScopInfo] Separate construction and finalization of memory accesses [NFC]
After having built memory accesses we perform some additional transformations
on them to increase the chances that our delinearization guesses the right
shape. Only after these transformations, we take the assumptions that the
array shape we predict is such that no out-of-bounds memory accesses arise.

Before this change, the construction of the memory access, the access folding
that improves the represenation for certain parametric subscripts, and taking
the assumption was all done right after a memory access was created. In this
change we split this now into three separate iterations over all memory
accesses. This means only after all memory accesses have been built, we start
to canonicalize accesses, and to take assumptions. This split prepares for
future canonicalizations that must consider all memory accesses for deriving
additional beneficial transformations.

llvm-svn: 288479
2016-12-02 05:21:22 +00:00
Johannes Doerfert b1d6608430 [NFC] Check for feasibility prior to the profitability check
Feasibility is checked late on its own but early it is hidden behind
the "PollyProcessUnprofitable" guard. This change will make sure we opt
out early if the runtime context is infeasible anyway.

llvm-svn: 288329
2016-12-01 11:12:14 +00:00
Johannes Doerfert b6c5a5dd01 [FIX] Do not try to hoist obviously overwritten loads
llvm-svn: 288328
2016-12-01 11:10:45 +00:00
Tobias Grosser dc6b87c56e Add newline at end of debug print
In '[DBG] Allow to emit the RTC value at runtime' the diagnostics were printed
without a newline at the end of each diagnostic. We add such a newline to
improve readability.

llvm-svn: 288323
2016-12-01 08:08:47 +00:00
Eugene Zelenko b3ee0ba7a1 Fix some Clang-tidy modernize-use-default and Include What You Use warnings; other minor fixes (NFC).
This preparation to remove SetVector.h dependency on SmallSet.h.

llvm-svn: 288174
2016-11-29 18:14:12 +00:00
Michael Kruse 36e79ecaec [DeLICM] Add pass boilerplate code.
Add an empty DeLICM pass, without any functional parts.

Extracting the boilerplate from the the functional part reduces the size of the
code to review (https://reviews.llvm.org/D24716)

Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 288160
2016-11-29 16:41:21 +00:00
Michael Kruse 11c5e07925 canSynthesize: Remove unused argument LI. NFC.
The helper function polly::canSynthesize() does not directly use the LoopInfo
analysis, hence remove it from its argument list.

llvm-svn: 288144
2016-11-29 15:11:04 +00:00
Tobias Grosser df8f35b7b8 Update for clang-format change in r288119
llvm-svn: 288134
2016-11-29 12:52:08 +00:00
Tobias Grosser 278f9e7d27 [ScopInfo] Use SCEVRewriteVisitor to simplify SCEVSensitiveParameterRewriter [NFC]
llvm-svn: 287984
2016-11-26 17:58:40 +00:00
Tobias Grosser b45ae5601b [ScopDetect] Expand statistics of the detected scops
We now collect:

  Number of total loops
  Number of loops in scops
  Number of scops
  Number of scops with maximal loop depth 1
  Number of scops with maximal loop depth 2
  Number of scops with maximal loop depth 3
  Number of scops with maximal loop depth 4
  Number of scops with maximal loop depth 5
  Number of scops with maximal loop depth 6 and larger
  Number of loops in scops (profitable scops only)
  Number of scops (profitable scops only)
  Number of scops with maximal loop depth 1 (profitable scops only)
  Number of scops with maximal loop depth 2 (profitable scops only)
  Number of scops with maximal loop depth 3 (profitable scops only)
  Number of scops with maximal loop depth 4 (profitable scops only)
  Number of scops with maximal loop depth 5 (profitable scops only)
  Number of scops with maximal loop depth 6 and larger (profitable scops only)

These statistics are certainly completely accurate as we might drop scops
when building up their polyhedral representation, but they should give a good
indication of the number of scops we detect.

llvm-svn: 287973
2016-11-26 07:37:46 +00:00
Tobias Grosser 5c00b0dc74 [ScopDetectionDiagnostic] Collect statistics for each diagnostic type
Our original statistics were added before we introduced a more fine-grained
diagnostic system, but the granularity of our statistics has never been
increased accordingly. This change introduces now one statistic counter per
diagnostic to enable us to collect fine-grained statistics about who certain
scops are not detected. In case coarser grained statistics are needed, the
user is expected to combine counters manually.

llvm-svn: 287968
2016-11-26 05:53:09 +00:00
Tobias Grosser 0dcbcaa98b [ScopDetectionDiagnostic] IrreducibleRegion is a subclasses of CFG
Reflect this correctly in the RejectReasonKind enum. The definition of
RejectReasonKind::IrreducibleRegion was introduced in r258497, when we started
to refuse regions containing irreducible loops.

llvm-svn: 287965
2016-11-26 05:08:27 +00:00
Tobias Grosser 8c21b1a50f [ScopDetectionDiagnostic] Remove leftover RejectReasonKind for Conditions [NFC]
In r248118 some diagnostics for unstructured control flow have been removed,
but the corresponding RejectReasonKind was accidentally not removed. This
change removes it, as it is not needed any more.

llvm-svn: 287964
2016-11-26 05:08:24 +00:00
Tobias Grosser c64269ea1b [ScopDectionDiagnostic] Use scoped enums instead three letter prefix [NFC]
This improves readability of the code.

llvm-svn: 287963
2016-11-26 03:44:31 +00:00
Hongbin Zheng 4ff1c3983d Fix typo.
llvm-svn: 287819
2016-11-23 21:59:33 +00:00
Tobias Grosser eab0943ec0 Update to isl-0.17.1-284-gbb38638
Regular maintenance update with only minor changes.

llvm-svn: 287703
2016-11-22 21:31:59 +00:00
Tobias Grosser b3c3d149b9 [CodeGen] Add flag to code-generate most memory access expressions
Introduce the new flag -polly-codegen-generate-expressions which forces Polly
to code generate AST expressions instead of using our SCEV based access
expression generation even for cases where the original memory access relation
was not changed and the SCEV based access expression could be code generated
without any issue.

This is an experimental option for better testing the isl ast expression
generation. The default behavior of Polly remains unchanged. We also exclude
a couple of cases for which the AST expression is not yet working.

llvm-svn: 287694
2016-11-22 20:21:16 +00:00
Tobias Grosser d51a945f38 [test] Simplify test case by removing unreferenced instructions [NFC]
Drop instructions that do not influence the memory impact of a basic block.
They are not needed to reproduce the original bug (verified) and will cause
random test noise if we would decide to only model the instructions that
have visible side-effects.

llvm-svn: 287626
2016-11-22 07:18:57 +00:00
Tobias Grosser 88c025e82e [test] Ensure important basic blocks in test case have side effects
Add two store instructions at the end of basic blocks that are required to
reproduce the original bug to ensure we always process and model these basic
blocks. This makes this test case stable even in case we would decide to bail
out early of basic blocks which do not modify the global state. Also add
additional check lines to verify how we model the basic block.

llvm-svn: 287625
2016-11-22 07:06:59 +00:00
Tobias Grosser 07ce9a0bcc test: add more details to non-affine test case
We add CHECK lines to this test case to make it easier to see the difference
between affine and non-affine memory accesses. We also change the test case to
use a parameteric index expression as otherwise our range analysis will
understand that the non-affine memory access can only access input[1],
which makes it difficult to see that the memory access is in-fact modeled as
non-affine access.

llvm-svn: 287623
2016-11-22 06:28:08 +00:00
Hongbin Zheng d559da84af Update comment for r287566
llvm-svn: 287572
2016-11-21 20:27:55 +00:00
Hongbin Zheng dac3808873 Fix format
llvm-svn: 287568
2016-11-21 20:17:00 +00:00
Hongbin Zheng a8fb73fc0b Split ScopInfo::addScopStmt into two versions. NFC
One for adding statement for region, another one for BB

llvm-svn: 287566
2016-11-21 20:09:40 +00:00
Hongbin Zheng 3ffa6f40b0 Minor change
llvm-svn: 287562
2016-11-21 19:26:10 +00:00
Tobias Grosser 1f0236d8e5 [ScopDetect] Use mayReadOrWriteMemory to shorten condition
llvm-svn: 287525
2016-11-21 09:07:30 +00:00
Tobias Grosser b94e9b31d0 [ScopDetect] Remove unnecessary namespace qualifier
llvm-svn: 287524
2016-11-21 09:04:45 +00:00
Johannes Doerfert 81aa6e882f [NFC] Adjust naming scheme of statistic variables
Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 287347
2016-11-18 14:37:08 +00:00
Johannes Doerfert 6cd59e9076 Probably overwritten loads should not be considered hoistable
Do not assume a load to be hoistable/invariant if the pointer is used by
another instruction in the SCoP that might write to memory and that is
always executed.

llvm-svn: 287272
2016-11-17 22:25:17 +00:00
Johannes Doerfert 50dfbc572a [NFC] Add flag to disable error block assumptions
The declaration as an "error block" is currently aggressive and not very
smart. This patch allows to disable error blocks completely. This might
be useful to prevent SCoP expansion to a point where the assumed context
becomes infeasible, thus the SCoP has to be discarded.

llvm-svn: 287271
2016-11-17 22:16:35 +00:00
Johannes Doerfert c97654681e [FIX] Do not try to hoist memory intrinsic
Since we do not necessarily treat memory intrinsics as non-affine
anymore, we have to check for them explicitly before we try to hoist an
access.

llvm-svn: 287270
2016-11-17 22:11:56 +00:00
Johannes Doerfert b3265a3612 [NFC] Skip over trivial assumptions
Filter trivial assumptions, thus assume { : } or restrict { : 0 = 1 },
as they clutter the user output as well as the statistics.

llvm-svn: 287269
2016-11-17 22:08:40 +00:00
Johannes Doerfert dae2e9287d [DBG] Collect statistics about actually versioned SCoPs
llvm-svn: 287267
2016-11-17 21:55:43 +00:00
Johannes Doerfert 8c5464a715 [DBG] Allow to emit the RTC value at runtime
The new command line flag "polly-codegen-emit-rtc-print" can be used to
place a "printf" in the generated code that will print the RTC value and
the overflow state.

llvm-svn: 287265
2016-11-17 21:49:19 +00:00
Johannes Doerfert cfadb2293f [DBG] Collect statistics about statically infeasible SCoPs
llvm-svn: 287263
2016-11-17 21:44:47 +00:00
Johannes Doerfert cd195326bf [DBG] Collect statistics about taken assumptions
llvm-svn: 287261
2016-11-17 21:41:08 +00:00
Tobias Grosser 06e1592663 Update to isl-0.17.1-267-gbf9723d
This update corrects an incorrect generation of min/max expressions in the isl
AST generator and a problematic nullptr dereference.

llvm-svn: 287098
2016-11-16 11:06:47 +00:00
Tobias Grosser 26be8e99b6 [ScopBuilder] Drop unnecessary namespace identifiers [NFC]
llvm-svn: 286781
2016-11-13 21:28:13 +00:00
Tobias Grosser 5743e8de86 [SCEVAffinator] Do not scan redundantly for parameters
In r286430 "SCEVValidator: add new parameters resulting from constant
extraction" we added functionality to scan for parameters after constant
extraction has taken place to ensure newly created parameters are correctly
registered. This addition made the already existing registration of parameters
redundant. Hence, we remove the corresponding call in this commit.

An alternative solution would have been to also perform constant extraction when
validating SCEV expressions and to then scan for parameters when validating
a SCEV expression. However, as SCEV validation is used during SCoP detection
where we want to be especially fast, adding additional functionality on this
hot path should be avoided if good alternatives exist. In this case, we can
choose to continue to only transform SCEV expression when actually modeling
them. As all transformations we perform are expected to not change the validity
of the SCEV expressions, this solution seems preferable.

Suggested-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286780
2016-11-13 21:28:07 +00:00
Tobias Grosser 70d2709b1a [ScopDetect] Conservatively handle inaccessible memory alias attributes
Commit r286294 introduced support for inaccessiblememonly and
inaccessiblemem_or_argmemonly attributes to BasicAA, which we need to
support to avoid undefined behavior. This change just refuses all calls
which are annotated with these attributes, which is conservatively correct.
In the future we may consider to model and support such function calls
in Polly.

llvm-svn: 286771
2016-11-13 19:27:24 +00:00
Tobias Grosser a9cac6a732 [tests] Adjust test output to recent changed SCEV canonocalization [NFC]
LLVM recently changed the SCEV canonicalization which changed the output of
one of our GPGPU test cases.

llvm-svn: 286770
2016-11-13 19:27:17 +00:00
Tobias Grosser a2f8fa33aa [ScopDetect] Evaluate and verify branches at branch condition, not icmp
The validity of a branch condition must be verified at the location of the
branch (the branch instruction), not the location of the icmp that is
used in the branch instruction. When verifying at the wrong location, we
may accept an icmp that is defined within a loop which itself dominates, but
does not contain the branch instruction. Such loops cannot be modeled as
we only introduce domain dimensions for surrounding loops. To address this
problem we change the scop detection to evaluate and verify SCEV expressions at
the right location.

This issue has been around since at least r179148 "scop detection: properly
instantiate SCEVs to the place where they are used", where we explicitly
set the scope to the wrong location. Before this commit the scope
was not explicitly set, which probably also resulted in the scope around the
ICmp to be choosen.

This resolves http://llvm.org/PR30989

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286769
2016-11-13 19:27:04 +00:00
Tobias Grosser f67433abd9 SCEVAffinator: pass parameter-only set to addRestriction if BB=nullptr
Assumptions can either be added for a given basic block, in which case the set
describing the assumptions is expected to match the dimensions of its domain.
In case no basic block is provided a parameter-only set is expected to describe
the assumption.

The piecewise expressions that are generated by the SCEVAffinator sometimes
have a zero-dimensional domain (e.g., [p] -> { [] : p <= -129 or p >= 128 }),
which looks similar to a parameter-only domain, but is still a set domain.

This change adds an assert that checks that we always pass parameter domains to
addAssumptions if BB is empty to make mismatches here fail early.

We also change visitTruncExpr to always convert to parameter sets, if BB is
null. This change resolves http://llvm.org/PR30941

Another alternative to this change would have been to inspect all code to make
sure we directly generate in the SCEV affinator parameter sets in case of empty
domains. However, this would likely complicate the code which combines parameter
and non-parameter domains when constructing a statement domain. We might still
consider doing this at some point, but as this likely requires several non-local
changes this should probably be done as a separate refactoring.

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286444
2016-11-10 11:44:10 +00:00
Tobias Grosser d0b9173caa IslAst: always use the context during ast generation
Providing the context to the ast generator allows for additional simplifcations
and -- more importantly -- allows to generate loops with only partially bounded
domains, assuming the domains are bounded for all parameter configurations
that are valid as defined by the context.

This change fixes the crash reported in http://llvm.org/PR30956

The original reason why we did not include the context when generating an
AST was that CLooG and later isl used to sometimes transfer some of the
constraints that bound the size of parameters from the context into the
generated AST. This resulted in operations with very large constants, which
sometimes introduced problematic integer overflows. The latest versions of
the isl AST generator are careful to not introduce such constants.

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286442
2016-11-10 09:39:58 +00:00
Tobias Grosser 4d543d654a SCEVValidator: add new parameters resulting from constant extraction
When extracting constant expressions out of SCEVs, new parameters may be
introduced, which have not been registered before. This change scans
SCEV expressions after constant extraction again to make sure newly
introduced parameters are registered.

We may for example extract the constant '8' from the expression '((8 * ((%a *
%b) + %c)) + (-8 * %a))' and obtain the expression '(((-1 + %b) * %a) + %c)'.
The new expression has a new parameter '(-1 + %b) * %a)', which was not
registered before, but must be registered to not crash.

This closes http://llvm.org/PR30953

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286430
2016-11-10 06:45:28 +00:00
Tobias Grosser bbaeda3fe5 Do not allow switch statements in loop latches
In r248701 "Allow switch instructions in SCoPs" support for switch statements
has been introduced, but support for switch statements in loop latches was
incomplete. This change completely disables switch statements in loop latches.

The original commit changed addLoopBoundsToHeaderDomain to support non-branch
terminator instructions, but this change was incorrect: it added a check for
BI != null to the if-branch of a condition, but BI was used in the else branch
es well. As a result, when a non-branch terminator instruction is encounted a
nullptr dereference is triggered. Due to missing test coverage, this bug was
overlooked.

r249273 "[FIX] Approximate non-affine loops correctly" added code to disallow
switch statements for non-affine loops, if they appear in either a loop latch
or a loop exit. We adapt this code to now prohibit switch statements in
loop latches even if the control condition is affine.

We could possibly add support for switch statements in loop latches, but such
support should be evaluated and tested separately.

This fixes llvm.org/PR30952

Reported-by: Eli Friedman <efriedma@codeaurora.org>
llvm-svn: 286426
2016-11-10 05:20:29 +00:00
Tobias Grosser eba86a1208 ScopInfo: only run code needed for ASSERT in DEBUG mode
Suggested-by: Johannes Doerfert
llvm-svn: 286338
2016-11-09 04:24:49 +00:00
Tobias Grosser a8ca3ed06a SCEVValidator: reduce indentation to increase readability [NFC]
llvm-svn: 286217
2016-11-08 07:17:48 +00:00
Tobias Grosser 16480186f8 IslNodeBuilder: Ensure newly generated memory accesses are well-defined
Add some additional asserts that ensure newly code-generated memory accesses
are defined on all domain and schedule domain instances.

llvm-svn: 286050
2016-11-05 21:46:01 +00:00
Tobias Grosser 744740ad91 ScopInfo: Ensure copy statement memory accesses are correct
Add asserts that verify that the memory accesses of a new copy statement
are defined for all domain instances the copy statement is defined for.

llvm-svn: 286047
2016-11-05 21:02:43 +00:00
Tobias Grosser 9321e208e8 Update isl to isl-0.17.1-243-g24c0339
This introduces big-endian support in imath and resolves
http://llvm.org/PR24632.

llvm-svn: 285993
2016-11-04 11:56:48 +00:00
Hongbin Zheng ada8544dfb Remove POLLY_LINK_LIBS, it is not used
llvm-svn: 285976
2016-11-04 00:32:32 +00:00
Michael Kruse e1dc387731 [ScopInfo] Fix isl object leak.
Fix return from function without releasing isl objects, which was introduced
in r269055.

llvm-svn: 285924
2016-11-03 15:19:41 +00:00
Eli Friedman acf8006471 [Polly CodeGen] Break critical edge from RTC to original loop.
This makes polly generate a CFG which is closer to what we want
in LLVM IR, with a loop preheader for the original loop. This is
just a cleanup, but it exposes some fragile assumptions.

I'm not completely happy with the changes related to expandCodeFor;
RTCBB->getTerminator() is basically a random insertion point which
happens to work due to the way we generate runtime checks. I'm not
sure what the right answer looks like, though.

Differential Revision: https://reviews.llvm.org/D26053

llvm-svn: 285864
2016-11-02 22:32:23 +00:00
Eli Friedman b9c6f01a81 [ScopInfo] Make memset etc. affine where possible.
We don't actually check whether a MemoryAccess is affine in very many
places, but one important one is in checks for aliasing.

Differential Revision: https://reviews.llvm.org/D25706

llvm-svn: 285746
2016-11-01 20:53:11 +00:00
Eli Friedman 6768285dcc Add missing test from r284848.
Original commit title: [SCEVAffinator] Make precise modular math
more correct.

llvm-svn: 285745
2016-11-01 20:45:28 +00:00
Tobias Grosser ebb626e4b7 [ScopDetect] Use SCEVRewriteVisitor to simplify SCEVRemoveSMax rewriter
ScalarEvolution got at some pointer a SCEVRewriteVisitor. Use it to simplify
our SCEVRemoveSMax visitor.

llvm-svn: 285491
2016-10-29 06:19:34 +00:00
Michael Kruse 426e6f71f8 [ScopInfo] Fix: use raw source pointer.
When adding an llvm.memcpy instruction to AliasSetTracker, it uses the raw
source and target pointers which preserve bitcasts.
MemAccInst::getPointerOperand() also returns the raw target pointers, but
Scop::buildAliasGroups() did not for the source pointer. This lead to mismatches
between AliasSetTracker and ScopInfo on which pointer to use.

Fixed by also using raw pointers in Scop::buildAliasGroups().

llvm-svn: 285071
2016-10-25 13:37:43 +00:00
Mandeep Singh Grang 3642dd3ca4 [polly] Change SmallPtrSet which is being iterated to SmallSetVector in ScopInfo.h
Summary: This will avoid non-deterministic iteration order.

Reviewers: grosser, jdoerfert, zinob, mgrang

Subscribers: #polly

Tags: #polly

Differential Revision: https://reviews.llvm.org/D25880

llvm-svn: 284883
2016-10-21 21:00:11 +00:00
Eli Friedman 286c5a76ba [SCEVAffinator] Make precise modular math more correct.
Integer math in LLVM IR is modular. Integer math in isl is
arbitrary-precision. Modeling LLVM IR math correctly in isl requires
either adding assumptions that math doesn't actually overflow, or
explicitly wrapping the math. However, expressions with the "nsw" flag
are special; we can pretend they're arbitrary-precision because it's
undefined behavior if the result wraps. SCEV expressions based on IR
instructions with an nsw flag also carry an nsw flag (roughly; actually,
the real rule is a bit more complicated, but the details don't matter
here).

Before this patch, SCEV flags were also overloaded with an additional
function: the ZExt code was mutating SCEV expressions as a hack to
indicate to checkForWrapping that we don't need to add assumptions to
the operand of a ZExt; it'll add explicit wrapping itself. This kind of
works... the problem is that if anything else ever touches that SCEV
expression, it'll get confused by the incorrect flags.

Instead, with this patch, we make the decision about whether to
explicitly wrap the math a bit earlier, basing the decision purely on
the SCEV expression itself, and not its users.

Differential Revision: https://reviews.llvm.org/D25287

llvm-svn: 284848
2016-10-21 18:08:02 +00:00
Mandeep Singh Grang 48e7add80f [polly] Change SmallPtrSet which are being iterated into SmallSetVector
Summary: Otherwise the lack of an iteration order results in non-determinism in codegen.

Reviewers: _jdoerfert, zinob, grosser

Tags: #polly

Differential Revision: https://reviews.llvm.org/D25863

llvm-svn: 284845
2016-10-21 17:29:10 +00:00
Michael Kruse 3d0266b664 [cmake] Avoid warnings in feature tests. NFC.
Apply the __attribute__((unused)) before the function to unambiguously apply to
the function declaration.

Add more casts-to-void to mark return values unused as intended.

Contributed-by: Andy Gibbs <andyg1001@hotmail.co.uk>
llvm-svn: 284718
2016-10-20 11:16:19 +00:00
Tobias Grosser 69ed8542f5 Update isl to isl-0.17.1-236-ga9c6cc7
This includes isl_id_to_str, which is used in Michael's upcoming DeLICM patch.

llvm-svn: 284689
2016-10-20 01:59:24 +00:00
Mandeep Singh Grang 5b1abfc88e [polly] Fix non-determinism in polly BlockGenerators
Summary: Iterating over SeenBlocks which is a SmallPtrSet results in non-determinism in codegen

Reviewers: jdoerfert, zinob, grosser

Tags: #polly

Differential Revision: https://reviews.llvm.org/D25778

llvm-svn: 284622
2016-10-19 17:56:49 +00:00
Michael Kruse 6b87504973 [test] Fix buildbot after SCEV change.
Update test after commit r284501:
[SCEV] Make CompareValueComplexity a little bit smarter

Contributed-by: Sanjoy Das <sanjoy@playingwithpointers.com>
llvm-svn: 284543
2016-10-18 22:58:09 +00:00
Eli Friedman 3c1a75bf9c Handle multi-dimensional invariant load.
If the address of a load depends on another load, make sure to emit
the loads in the right order.

llvm-svn: 284426
2016-10-17 21:04:26 +00:00
Michael Kruse 6a19d592da [ScopDetect] Depend transitively on ScalarEvolution.
ScopDetection might be queried by -dot-scops or -view-scops passes for which
it accesses ScalarEvolution.

llvm-svn: 284385
2016-10-17 13:29:20 +00:00
Michael Kruse 2ddb279a39 [test] Add missing colon.
llvm-svn: 284349
2016-10-16 22:05:51 +00:00
Michael Kruse 17d5090532 [cmake] Add polly-isl-test dependency to lit tests.
Also handle the in-llvm-tree case forgotten in r284339.

llvm-svn: 284347
2016-10-16 21:35:57 +00:00