Commit Graph

1839 Commits

Author SHA1 Message Date
Johannes Doerfert c0ece9b67e [NFC] Generate runtime checks after the SCoP
We now generate runtime checks __after__ the SCoP code generation and
  not before, though they are still inserted at the same position int
  the code. This allows to modify the runtime check during SCoP code
  generation.

llvm-svn: 271894
2016-06-06 13:32:52 +00:00
Johannes Doerfert 4db8d80730 [FIX] Determine insertion point during SCEV expansion
llvm-svn: 271892
2016-06-06 13:05:21 +00:00
Johannes Doerfert 1a6b0f7f07 [NFC] Refactor assumption tracking interface
llvm-svn: 271890
2016-06-06 12:16:10 +00:00
Johannes Doerfert 6a6a671c72 [NFC] Simplify code
llvm-svn: 271889
2016-06-06 12:13:24 +00:00
Johannes Doerfert dedb7693ec Look through IntToPtr & PtrToInt instructions
IntToPtr and PtrToInt instructions are basically no-ops that we can handle as
  such. In order to generate them properly as parameters we had to improve the
  ScopExpander, though the change is the first in the direction of a more
  aggressive scalar synthetization.

llvm-svn: 271888
2016-06-06 12:12:27 +00:00
Johannes Doerfert b71900b89c [NFC] Simplify code
llvm-svn: 271886
2016-06-06 12:09:30 +00:00
Johannes Doerfert 4b2fd892ec [FIX] Do not recognize division by 0 as affine
llvm-svn: 271885
2016-06-06 12:08:34 +00:00
Johannes Doerfert f643785b14 Replace getSCEV with getSCEVAtScope
llvm-svn: 271881
2016-06-06 10:07:40 +00:00
Johannes Doerfert ba91a58e42 [NFC] Use the ScalarEvolution member of the SCEVAffinator
llvm-svn: 271880
2016-06-06 10:06:53 +00:00
Johannes Doerfert 48975276be [NFC] Coalesce invariant context sets early
llvm-svn: 271879
2016-06-06 10:06:07 +00:00
Johannes Doerfert 0767a511ba Use minimal types for generated expressions
We now use the minimal necessary bit width for the generated code. If
  operations might overflow (add/sub/mul) we will try to adjust the types in
  order to ensure a non-wrapping computation. If the type adjustment is not
  possible, thus the necessary type is bigger than the type value of
  --polly-max-expr-bit-width, we will use assumptions to verify the computation
  will not wrap. However, for run-time checks we cannot build assumptions but
  instead utilize overflow tracking intrinsics.

llvm-svn: 271878
2016-06-06 09:57:41 +00:00
Roman Gareev ba0fb97c0a [NFC] Check that a parameter of ScheduleTreeOptimizer::isMatrMultPattern contains a correct partial schedule
llvm-svn: 271780
2016-06-04 06:34:04 +00:00
Michael Kruse 5c527f9963 Fix modulo compared to zero.
In case of modulo compared to zero, we need to do signed modulo
operation as unsigned can give different results based on whether the
dividend is negative or not.

This addresses llvm.org/PR27707

Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Reviewers: _jdoerfert, grosser, Meinersbur

Differential Revision: http://reviews.llvm.org/D20145

llvm-svn: 271707
2016-06-03 18:51:48 +00:00
Roman Gareev 4b8c7aeb62 [FIX] Fix potential issue related to subtraction from an unsigned 0 in circularShiftOutputDims
Reported-by: Mehdi Amini <mehdi.amini@apple.com>
Contributed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: http://reviews.llvm.org/D20969

llvm-svn: 271705
2016-06-03 18:46:29 +00:00
Johannes Doerfert 6393ef135c Temporarily promote values to i64 again
Operands of binary operations that might overflow will be temporarily
  promoted to i64 again, though that is not a sound solution for the problem.

llvm-svn: 271538
2016-06-02 17:09:22 +00:00
Johannes Doerfert 4cf79d4ca4 [NFC] Avoid unnecessary comparison for min/max expressions
llvm-svn: 271535
2016-06-02 16:58:12 +00:00
Johannes Doerfert 6631bfdd1c [FIX] Correctly translate i1 expressions
llvm-svn: 271534
2016-06-02 16:57:12 +00:00
Johannes Doerfert 06445deda4 Simplify the schedule domain according to the context
llvm-svn: 271522
2016-06-02 15:07:41 +00:00
Johannes Doerfert e86a551618 [NFC] Rename ScopInfo to ScopBuilder
Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in>
  Reviewed-by: Michael Kruse <meinersbur@googlemail.com>
               Johannes Doerfert <doerfert@cs.uni-saarland.de>

  Differential Revision: http://reviews.llvm.org/D20831

llvm-svn: 271521
2016-06-02 14:36:34 +00:00
Matthew Simpson acae9e3b30 [Polly] Fix -Wunused-variable warnings (NFC)
llvm-svn: 271518
2016-06-02 14:26:38 +00:00
Johannes Doerfert 47f15f6d7e [NFC] Simplify min/max expression generation
llvm-svn: 271514
2016-06-02 11:20:52 +00:00
Johannes Doerfert d36553753e Simplify the type adjustment in the IslExprBuilder
We now have a simple function to adjust/unify the types of two (or three)
  operands before an operation that requieres the same type for all operands.
  Due to this change we will not promote parameters that are added to i64
  anymore if that is not needed.

llvm-svn: 271513
2016-06-02 11:15:57 +00:00
Johannes Doerfert a91c85a5b9 [FIX] Ensure wrapping checks for unary expressions
llvm-svn: 271512
2016-06-02 11:08:43 +00:00
Johannes Doerfert 5210da5897 Bail early for complex alias checks
llvm-svn: 271511
2016-06-02 11:06:54 +00:00
Roman Gareev 76614d3ed9 [GSoC 2016] [Polly] [FIX] Determination of statements that contain matrix
multiplication

Fix small issues related to characters, operators  and descriptions of tests.

Differential Revision: http://reviews.llvm.org/D20806

llvm-svn: 271264
2016-05-31 11:22:21 +00:00
Johannes Doerfert 99191c78c2 Decouple SCoP building logic from pass
Created a new pass ScopInfoRegionPass. As name suggests, it is a
  region pass and it is there to preserve compatibility with our
  existing Polly passes.  ScopInfoRegionPass will return a SCoP object
  for a valid region while the creation of the SCoP stays in the
  ScopInfo class.

  Contributed-by: Utpal Bora <cs14mtech11017@iith.ac.in>
  Reviewed-by: Tobias Grosser <tobias@grosser.es>,
               Johannes Doerfert <doerfert@cs.uni-saarland.de>

Differential Revision: http://reviews.llvm.org/D20770

llvm-svn: 271259
2016-05-31 09:41:04 +00:00
Michael Kruse 7410a27820 MSVC compile fix: #include <ciso646>. NFC.
This header is required to make the ISO 646 alternative operator
spellings ("and", "or" instead of "&&", "||") work. Should these
operators be replaced by the standard ones as already suggested by
Johannes, also remove this #include again.

llvm-svn: 271206
2016-05-30 14:27:14 +00:00
Sanjoy Das 03bcb910de [Polly] Remove usage of the `apply` function
Summary:
API-wise `apply` is a somewhat unidiomatic one-off function, and
removing the only(?) use in polly will let me remove it from SCEV's
exposed interface.

Reviewers: jdoerfert, Meinersbur, grosser

Subscribers: grosser, mcrosier, pollydev

Differential Revision: http://reviews.llvm.org/D20779

llvm-svn: 271177
2016-05-29 07:33:16 +00:00
Roman Gareev 9c3eb5960a Determination of statements that contain matrix multiplication
Add determination of statements that contain, in particular,
matrix multiplications and can be optimized with [1] to try to
get close-to-peak performance. It can be enabled
via polly-pm-based-opts, which is false by default.

Refs:
[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

Contributed-by: Roman Gareev <gareevroman@gmail.com>
Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: http://reviews.llvm.org/D20575

llvm-svn: 271128
2016-05-28 16:17:58 +00:00
Michael Kruse 1007182cf7 [ScopInfo] Change removeMemoryAccesses to remove only one access. NFC.
This exposes the more basic operation for use by code not related to
invariant code hoisting.

llvm-svn: 270438
2016-05-23 14:45:58 +00:00
Michael Kruse 996fb611b3 Remove some unused local variables. NFC.
Found by clang static analyzer (http://llvm.org/reports/scan-build/)
and Visual Studio.

llvm-svn: 270432
2016-05-23 13:00:41 +00:00
Johannes Doerfert 0f0d209bec Use the SCoP directly for canSynthesize [NFC]
llvm-svn: 270429
2016-05-23 12:47:09 +00:00
Johannes Doerfert 57a7317fb8 Simplify ScopInfo function interfaces [NFC]
llvm-svn: 270428
2016-05-23 12:45:17 +00:00
Johannes Doerfert e0b08077bf Allow to check for dominance wrt. a SCoP [NFC]
llvm-svn: 270427
2016-05-23 12:43:44 +00:00
Johannes Doerfert ef74443c97 Duplicate part of the Region interface in the Scop class [NFC]
This allows to use the SCoP directly for various queries,
  thus to hide the underlying region more often.

llvm-svn: 270426
2016-05-23 12:42:38 +00:00
Johannes Doerfert 952b5304bc Add and use Scop::contains(Loop/BasicBlock/Instruction) [NFC]
llvm-svn: 270424
2016-05-23 12:40:48 +00:00
Johannes Doerfert 3f52e35471 Directly access information through the Scop class [NFC]
llvm-svn: 270421
2016-05-23 12:38:05 +00:00
Johannes Doerfert 25227fe7b0 Optimistic assume required invariant loads to be invariant
Before this patch we bailed if a required invariant load was potentially
  overwritten. However, now we will optimistically assume it is actually
  invariant and, to this end, restrict the valid parameter space as well as the
  execution context with regards to potential overwrites of the location.

llvm-svn: 270416
2016-05-23 10:40:54 +00:00
Johannes Doerfert 764b7e66f0 [FIX] Require base pointers of loads that might alias to be hoisted
Since the base pointer of a possibly aliasing pointer might not alias
  with any other pointer it (the base pointer) might not be tagged as
  "required invariant". However, we need it do be in order to compare
  the accessed addresses of the derived (possibly aliasing) pointer.

  This patch also tries to clean up the load hoisting a little bit.

llvm-svn: 270412
2016-05-23 09:26:46 +00:00
Johannes Doerfert 38a012c46b Simplify BlockGenerator::handleOutsideUsers interface [NFC]
llvm-svn: 270411
2016-05-23 09:14:07 +00:00
Johannes Doerfert 1dafea4114 Make the detection context non-constant [NFC]
llvm-svn: 270410
2016-05-23 09:07:08 +00:00
Johannes Doerfert a61eda7698 [FIX] Let ScalarEvolution forget hoisted values
We have to rethink the handling of escaping values in order to make
  this kind of "fixes" go away.

llvm-svn: 270409
2016-05-23 09:02:54 +00:00
Johannes Doerfert 1a4ad8f771 [FIX] Synthezise Sdiv/Srem/Udiv instructions correctly.
This patch simplifies the Sdiv/Srem/Udiv expansion and thereby
  prevents errors, e.g., regarding the insertion point.

llvm-svn: 270408
2016-05-23 08:55:43 +00:00
Johannes Doerfert cda1bd5048 Revert "Optimistic assume required invariant loads to be invariant"
This reverts commit 787e642207ca978f2e800140529fc7049ea1f3de until the
lnt failures are fixed.

llvm-svn: 270061
2016-05-19 13:47:34 +00:00
Johannes Doerfert cb77542d1c Optimistic assume required invariant loads to be invariant
So far we bailed if a required invariant load was potentially overwritten in
  the SCoP. From now on we will optimistically assume it is actually invariant
  and, to this end, restrict the valid parameter space.

llvm-svn: 270060
2016-05-19 13:24:10 +00:00
Johannes Doerfert 469db6a247 Move internal enum out of class declaration [NFC]
llvm-svn: 270054
2016-05-19 12:36:43 +00:00
Johannes Doerfert ffd222f2d6 Propagate the DetectionContext to the SCoP [NFC]
The SCoP now holds a reference to the ScopDetection::DetectionContext
  which allows to simplify the type of various methods and remove code.

llvm-svn: 270053
2016-05-19 12:34:57 +00:00
Johannes Doerfert 60dd9e1346 Compute the MaxLoopDepth during domain construction [NFC]
llvm-svn: 270052
2016-05-19 12:33:14 +00:00
Johannes Doerfert f5841a66af Remove leftover debug output [NFC]
llvm-svn: 270051
2016-05-19 12:32:54 +00:00
Johannes Doerfert 6dc3616195 Remove unsused methodes [NFC]
llvm-svn: 270050
2016-05-19 12:31:16 +00:00
Johannes Doerfert e6e3c9246a Check late for profitability
Before this patch we only expanded valid __and__ profitable region. Therefor
  we did not allow the expansion to create a profitable region from a
  non-profitable one.  With this patch we will remember and expand all valid
  regions and check for profitability only at the end.

  This patch increases the number of valid SCoPs in the LLVM-TS and SPEC
  2000/2006 by 28% (from 303 to 390), including the hot loop in hmmer.

llvm-svn: 269343
2016-05-12 20:21:50 +00:00
Johannes Doerfert 6c7639b380 Cleanup rejection log handling [NFC]
This patch cleans up the rejection log handling during the
  ScopDetection. It consists of two interconnected parts:
    - We keep all detection contexts for a function in order to provide
      more information to the user, e.g., about the rejection of
      extended/intermediate regions.
    - We remove the mutable "RejectLogs" member as the information is
      available through the detection contexts.

llvm-svn: 269323
2016-05-12 18:50:01 +00:00
Johannes Doerfert 5c2b556b13 Bring some comments up to date [NFC]
llvm-svn: 269301
2016-05-12 15:15:50 +00:00
Johannes Doerfert 6f1bb7a9d9 Support truncate operations
Truncate operations are basically modulo operations, thus we can model
  them that way. However, for large types we assume the operand to fit
  in the new type size instead of introducing a modulo with a very large
  constant.

llvm-svn: 269300
2016-05-12 15:13:49 +00:00
Johannes Doerfert 404a0f81ea Check overflows in RTCs and bail accordingly
We utilize assumptions on the input to model IR in polyhedral world.
  To verify these assumptions we version the code and guard it with a
  runtime-check (RTC). However, since the RTCs are themselves generated
  from the polyhedral representation we generate them under the same
  assumptions that they should verify. In other words, the guarantees
  that we try to provide with the RTCs do not hold for the RTCs
  themselves. To this end it is necessary to employ a different check
  for the RTCs that will verify the assumptions did hold for them too.

Differential Revision: http://reviews.llvm.org/D20165

llvm-svn: 269299
2016-05-12 15:12:43 +00:00
Johannes Doerfert 27d12d3d1f Invalidate unprofitable SCoPs after creation
If a profitable run is performed we will check if the SCoP seems to be
  profitable after creation but before e.g., dependence are computed. This is
  needed as SCoP detection only approximates the actual SCoP representation.
  In the end this should allow us to be less conservative during the SCoP
  detection while keeping the compile time in check.

llvm-svn: 269074
2016-05-10 16:38:09 +00:00
Johannes Doerfert bf9473b2d8 Weaken profitability constraints during ScopDetection
Regions with one affine loop can be profitable if the loop is
  distributable. To this end we will allow them to be treated as
  profitable if they contain at least two non-trivial basic blocks.

llvm-svn: 269064
2016-05-10 14:42:30 +00:00
Johannes Doerfert ede4ecaefb [FIX] Cleanup isl objects prior to early exit
llvm-svn: 269061
2016-05-10 14:01:21 +00:00
Johannes Doerfert 2b92a0e4ee Handle llvm.assume inside the SCoP
The assumption attached to an llvm.assume in the SCoP needs to be
  combined with the domain of the surrounding statement but can
  nevertheless be used to refine the context.

  This fixes the problems mentioned in PR27067.

llvm-svn: 269060
2016-05-10 14:00:57 +00:00
Johannes Doerfert 297c720d15 Propagate complexity problems during domain generation [NFC]
This patches makes the propagation of complexity problems during
  domain generation consistent. Additionally, it makes it less likely to
  encounter ill-formed domains later, e.g., during schedule generation.

llvm-svn: 269055
2016-05-10 13:06:42 +00:00
Johannes Doerfert 14b1cf35b5 [FIX] Create error-restrictions late
Before this patch we generated error-restrictions only for
  error-blocks, thus blocks (or regions) containing a not represented
  function call. However, the same reasoning is needed if the invalid
  domain of a statement subsumes its actual domain. To this end we move
  the generation of error-restrictions after the propagation of the
  invalid domains. Consequently, error-statements are now defined more
  general as statements that are assumed to be not executed.
  Additionally, we do not record an empty domain for such statements but
  a nullptr instead. This allows to distinguish between error-statements
  and dead-statements.

llvm-svn: 269053
2016-05-10 12:42:26 +00:00
Johannes Doerfert 2640454d1c Refactor simplifySCoP [NFC]
Remove obsolete code and decrease the indention in the
  Scop::simplifySCoP() function.

llvm-svn: 269049
2016-05-10 12:19:47 +00:00
Johannes Doerfert a60ad845c0 Simplify the internal representation according to the context [NFC]
We now use context information to simplify the domains and access
  functions of the SCoP instead of just aligning them with the parameter
  space.

llvm-svn: 269048
2016-05-10 12:18:22 +00:00
Johannes Doerfert e243753a4d Simplify access relation for invariant loads early [NFC]
llvm-svn: 269046
2016-05-10 11:59:59 +00:00
Johannes Doerfert 5f173d414e Prevent complex access ranges with low number of pieces.
Previously we checked the number of pieces to decide whether or not a
  invariant load was to complex to be generated. However, there are
  cases when e.g., divisions cause the complexity to spike regardless of
  the number of pieces. To this end we now check the number of totally
  involved dimensions which will increase with the number of pieces but
  also the number of divisions.

llvm-svn: 269045
2016-05-10 11:46:57 +00:00
Johannes Doerfert 56b377644a Expose interpretAsUnsigned in the SCEVAffinator [NFC]
This exposes the functionality to interpret a SCEV, or better the
  piece-wise function created from the SCEV, as an unsigned value
  instead of a signed one.

llvm-svn: 269044
2016-05-10 11:45:46 +00:00
Tobias Grosser 1022ca5646 Codegen: Enable the detection of min/max expressions
Min/max expressions are easier to read and can in some cases also result in
more concise IR that is generated as the min/max --- when lowered to a
cmp+select pattern -- commonly has a simpler condition then the ternary
condition isl would normally generate.

llvm-svn: 268855
2016-05-07 08:03:44 +00:00
Tobias Grosser 6b49f17764 Update isl to isl-0.17-5-g57dc5ff
This update fixes an assertion in the isl scheduler.

llvm-svn: 268853
2016-05-07 07:41:25 +00:00
Michael Kruse e0b34f366f Update to ISL 0.17.
This release includes sevaral improvments compared to the previous
version isl-0.16.1-145-g243bf7c (from the ISL 0.17 announcement):
- optionally combine SCCs incrementally in scheduler
- optionally maximize coincidence in scheduler
- optionally avoid loop coalescing in scheduler
- minor AST generator improvements
- improve support for expansions in schedule trees

llvm-svn: 268500
2016-05-04 14:41:36 +00:00
Michael Kruse f7a4a94d05 Typo: ToComplex -> TooComplex. NFC.
llvm-svn: 268224
2016-05-02 12:25:36 +00:00
Michael Kruse bc150127ae Rename Conjuncts -> Disjunctions. NFC.
The check for complexity compares the number of polyhedra in a set,
which are combined by disjunctions (union, "OR"),
not conjunctions (intersection, "AND").

llvm-svn: 268223
2016-05-02 12:25:18 +00:00
Michael Kruse 315aa3278e [ScheduleOptimizer] Add -polly-opt-outer-coincidence option.
Add a command line switch to set the
isl_options_set_schedule_outer_coincidence option. ISL then tries to
build schedules where the outer member of a band satisfies the
coincidence constraints.

In practice this allows loop skewing for more parallelism in inner
loops.

llvm-svn: 268222
2016-05-02 11:35:27 +00:00
Michael Kruse 2d3ff2a5ba Typo: isToComplex -> isTooComplex. NFC.
llvm-svn: 268220
2016-05-02 10:44:20 +00:00
Johannes Doerfert 172dd8b923 Allow unsigned divisions
After zero-extend operations and unsigned comparisons we now allow
  unsigned divisions. The handling is basically the same as for signed
  division, except the interpretation of the operands. As the divisor
  has to be constant in both cases we can simply interpret it as an
  unsigned value without additional complexity in the representation.
  For the dividend we could choose from the different representation
  schemes introduced for zero-extend operations but for now we will
  simply use an assumption.

llvm-svn: 268032
2016-04-29 11:53:35 +00:00
Johannes Doerfert ba9725ff41 Refactor SCEVAffinator [NFC]
llvm-svn: 268031
2016-04-29 11:52:30 +00:00
Tobias Grosser 2937b59393 ScopInfo: Add option to control abort on isl errors
For debugging it is often convenient to not abort at the very first memory
management error. This option allows to control this behavior at run-time.

llvm-svn: 268030
2016-04-29 11:43:20 +00:00
Johannes Doerfert 3e48ee2ab9 [FIX] Unsigned comparisons change invalid domain
It does not suffice to take a global assumptions for unsigned comparisons but
  we also need to adjust the invalid domain of the statements guarded by such
  an assumption. To this end we allow to specialize the getPwAff call now in
  order to indicate unsigned interpretation.

llvm-svn: 268025
2016-04-29 10:44:41 +00:00
Johannes Doerfert bfaa63a82e [FIX] Prevent division/modulo by zero in parameters
When we materialize parameter SCEVs we did so without considering the
  side effects they might have, e.g., both division and modulo are
  undefined if the right hand side is zero. This is a problem because we
  potentially extended the domain under which we evaluate parameters,
  thus we might have introduced such undefined behaviour. To prevent
  that from happening we will now guard divisions and modulo operations
  in the parameters with a compare and select.

llvm-svn: 268023
2016-04-29 10:36:58 +00:00
Johannes Doerfert 8475d1c163 [FIX] Correct assumption simplification
Assumptions and restrictions can both be simplified with the domain of a
  statement but not the same way. After this patch we will correctly
  distinguish them.

llvm-svn: 267885
2016-04-28 14:32:58 +00:00
Tobias Grosser 2e27a0f5fd BlockGenerator: Drop leftover debug statement
llvm-svn: 267874
2016-04-28 12:31:05 +00:00
Johannes Doerfert 8ab2803b63 [FIX] Propagate execution domain of invariant loads
If the base pointer of an invariant load is is loaded conditionally, that
  condition needs to hold for the invariant load too. The structure of the
  program will imply this for domain constraints but not for imprecisions in
  the modeling. To this end we will propagate the execution context of base
  pointers during code generation and thus ensure the derived pointer does
  not access an invalid base pointer.

llvm-svn: 267707
2016-04-27 12:49:11 +00:00
Johannes Doerfert 792374b941 Allow unsigned comparisons
With this patch we will optimistically assume that the result of an unsigned
  comparison is the same as the result of the same comparison interpreted as
  signed.

llvm-svn: 267559
2016-04-26 14:33:12 +00:00
Johannes Doerfert 323ab3975b [FIX] Adjust assumption space for zext instructions
llvm-svn: 267552
2016-04-26 12:44:01 +00:00
Johannes Doerfert b2885799d1 Do not use the number of parameters in the complexity check
llvm-svn: 267532
2016-04-26 09:20:41 +00:00
Johannes Doerfert 625bb1fc10 Do not add but record signed-unsigned assumptions
llvm-svn: 267528
2016-04-26 09:16:36 +00:00
Johannes Doerfert 9cc8340fea Extract some constant factors from "SCEVAddExprs"
Additive expressions can have constant factors too that we can extract
  and thereby simplify the internal representation. For now we do
  compute the gcd of all constant factors but only extract the same
  (possibly negated) factor if there is one.

llvm-svn: 267445
2016-04-25 19:09:10 +00:00
Johannes Doerfert d5c369f460 Do not check all GEPs for assumptions
Before, we checked all GEPs in a statement in order to derive
  out-of-bound assumptions. However, this can not only introduce new
  parameters but it is also not clear what we can learn from GEPs that
  are not immediately used in a memory accesses inside the SCoP. As this
  case is very rare, no actual change in the behaviour is expected.

llvm-svn: 267442
2016-04-25 18:55:15 +00:00
Johannes Doerfert c78ce7dc21 Only add user assumptions on known parameters [NFC]
Before, assumptions derived from llvm.assume could reference new
  parameters that were not known to the SCoP before. These were neither
  beneficial to the representation nor to the user that reads the
  emitted remark. Now we project them out and keep only user assumptions
  on known parameters. Nevertheless, the new parameters are still part
  of the SCoPs parameter space as the SCEVAffinator currently adds them
  on demand.

llvm-svn: 267441
2016-04-25 18:51:27 +00:00
Johannes Doerfert 4e3bb7b98c Refactor Scop parameter handling
The new handling is consistent with the remaining code, e.g., we do
  not create a new parameter id for each lookup call but copy an
  existing one. Additionally, we now use the implicit order defined by
  the Parameters set instead of an explicit one defined in a map.

llvm-svn: 267423
2016-04-25 16:15:13 +00:00
Johannes Doerfert c3596284c3 Model zext-extend instructions
A zero-extended value can be interpreted as a piecewise defined signed
  value. If the value was non-negative it stays the same, otherwise it
  is the sum of the original value and 2^n where n is the bit-width of
  the original (or operand) type. Examples:
    zext i8 127 to i32 -> { [127] }
    zext i8  -1 to i32 -> { [256 + (-1)] } = { [255] }
    zext i8  %v to i32 -> [v] -> { [v] | v >= 0; [256 + v] | v < 0 }

  However, LLVM/Scalar Evolution uses zero-extend (potentially lead by a
  truncate) to represent some forms of modulo computation. The left-hand side
  of the condition in the code below would result in the SCEV
  "zext i1 <false, +, true>for.body" which is just another description
  of the C expression "i & 1 != 0" or, equivalently, "i % 2 != 0".

    for (i = 0; i < N; i++)
      if (i & 1 != 0 /* == i % 2 */)
        /* do something */

  If we do not make the modulo explicit but only use the mechanism described
  above we will get the very restrictive assumption "N < 3", because for all
  values of N >= 3 the SCEVAddRecExpr operand of the zero-extend would wrap.
  Alternatively, we can make the modulo in the operand explicit in the
  resulting piecewise function and thereby avoid the assumption on N. For the
  example this would result in the following piecewise affine function:
  { [i0] -> [(1)] : 2*floor((-1 + i0)/2) = -1 + i0;
    [i0] -> [(0)] : 2*floor((i0)/2) = i0 }
  To this end we can first determine if the (immediate) operand of the
  zero-extend can wrap and, in case it might, we will use explicit modulo
  semantic to compute the result instead of emitting non-wrapping assumptions.

  Note that operands with large bit-widths are less likely to be negative
  because it would result in a very large access offset or loop bound after the
  zero-extend. To this end one can optimistically assume the operand to be
  positive and avoid the piecewise definition if the bit-width is bigger than
  some threshold (here MaxZextSmallBitWidth).

  We choose to go with a hybrid solution of all modeling techniques described
  above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the
  wrapping explicitly and use a piecewise defined function. However, if the
  bit-width is bigger than MaxZextSmallBitWidth we will employ overflow
  assumptions and assume the "former negative" piece will not exist.

llvm-svn: 267408
2016-04-25 14:01:36 +00:00
Johannes Doerfert 517d8d2f94 Check only loop control of loops that are part of the region
This also removes a duplicated line of code in the region generator
  that caused a SPEC benchmark to fail with the new SCoPs.

llvm-svn: 267404
2016-04-25 13:37:24 +00:00
Johannes Doerfert a4dd8ef40f Initialize the invalid domain of an access with an empty set
llvm-svn: 267403
2016-04-25 13:36:23 +00:00
Johannes Doerfert e4459a24cc Do not propagate invalid domains over back edges
llvm-svn: 267402
2016-04-25 13:34:50 +00:00
Johannes Doerfert f560b3d2db Introduce a parameter set type [NFC]
llvm-svn: 267401
2016-04-25 13:33:07 +00:00
Johannes Doerfert ec8a217729 Remove unnecessary argument of the SCEVValidator [NFC]
llvm-svn: 267400
2016-04-25 13:32:36 +00:00
Johannes Doerfert 85676e3674 Add an invalid domain to memory accesses
Memory accesses can have non-precisely modeled access functions that
  would cause us to build incorrect execution context for hoisted loads.
  This is the same issue that occurred during the domain construction for
  statements and it is dealt with the same way.

llvm-svn: 267289
2016-04-23 14:32:34 +00:00
Johannes Doerfert ac9c32e216 Translate SCEVs to isl_pw_aff and their invalid domain
The SCEVAffinator will now produce not only the isl representaiton of
  a SCEV but also the domain under which it is invalid. This is used to
  record possible overflows that can happen in the statement domains in
  the statements invalid domain. The result is that invalid loads have
  an accurate execution contexts with regards to the validity of their
  statements domain. While the SCEVAffinator currently is only taking
  "no-wrapping" assumptions, we can add more withouth worrying about the
  execution context of loads that are optimistically hoisted.

llvm-svn: 267288
2016-04-23 14:31:17 +00:00
Johannes Doerfert a3519515b5 Track invalid domains not invalid contexts for statements
The invalid context is not enough to describe the parameter constraints under
  which a statement is not modeled precisely. The reason is that during the
  domain construction the bounds on the induction variables are not known but
  needed to check if e.g., an overflow can actually happen. To this end we
  replace the invalid context of a statement with an invalid domain. It is
  initialized during domain construction and intersected with the domain once
  it was completely build. Later this invalid domain allows to eliminate
  falsely assumed wrapping cases and other falsely assumed mismatches in the
  modeling.

llvm-svn: 267286
2016-04-23 13:02:23 +00:00
Johannes Doerfert 94341c996d Improve accuracy of Scop::hasFeasibleRuntimeContext
If the AssumptionContext is a subset of the InvalidContext the runtime
  context is not feasible.

llvm-svn: 267285
2016-04-23 13:00:27 +00:00
Johannes Doerfert 1dc12aff8a Simplify the execution context for dereferencable loads
If we know it is safe to execute a load we do not need an execution
  context, however only if we are sure it was modeled correctly.

llvm-svn: 267284
2016-04-23 12:59:18 +00:00
Johannes Doerfert f4f1d9a5cf Remove simplification calls for the execution domain [NFC]
These calls were sometimes costly and do not show any improvements on our
  small test cases.

llvm-svn: 267283
2016-04-23 12:56:58 +00:00
Johannes Doerfert d77089e62d Bail for complex execution contexts of invariant loads
llvm-svn: 267146
2016-04-22 11:41:14 +00:00
Johannes Doerfert 5d03f84cf5 Early exit for addInvariantLoads
llvm-svn: 267143
2016-04-22 11:38:44 +00:00
Johannes Doerfert 6296d95420 Bail for complex alias checks
llvm-svn: 267142
2016-04-22 11:38:19 +00:00
Johannes Doerfert 171b92f1e1 Relate domains to statements during construction [NFC]
Instead of the Scop::getPwAff() function we now use the ScopStmt::getPwAff()
  function during the statements domain construction.

llvm-svn: 266741
2016-04-19 14:53:13 +00:00
Johannes Doerfert ff68f46458 Add user assumptions after domain generation [NFC]
llvm-svn: 266740
2016-04-19 14:49:42 +00:00
Johannes Doerfert 535de03571 Do not build domains for out of SCoP blocks [NFC]
llvm-svn: 266739
2016-04-19 14:49:05 +00:00
Johannes Doerfert fff283df7a Mark Scop::getDomainConditions as const [NFC]
llvm-svn: 266738
2016-04-19 14:48:22 +00:00
Tobias Grosser 90303f872d SCoPValidator: Use SCEVTraversal to simplify SCEVInRegionDependences
llvm-svn: 266622
2016-04-18 15:46:27 +00:00
Johannes Doerfert fb72187fdd [FIX] Check the invalid context agains the context to rule out SCoPs
llvm-svn: 266096
2016-04-12 17:54:29 +00:00
Johannes Doerfert 2f70584ae6 Do not by default minimize remarks
We used checks to minimize the number of remarks we present to a user
  but these checks can become expensive, especially since all wrapping
  assumptions are emitted separately. Because there is not benefit for a
  "headless" run we put these checks under a command line flag. Thus, if
  the flag is not given we will emit "non-effective" remarks, e.g.,
  duplicates and revert to the old behaviour if it is given. As this
  also changes the internal representation of some sets we set the flag
  by default for our unit tests.

llvm-svn: 266087
2016-04-12 16:09:44 +00:00
Johannes Doerfert 615e0b85f8 Record wrapping assumptions early
Utilizing the record option for assumptions we can simplify the wrapping
  assumption generation a lot. Additionally, we can now report locations
  together with wrapping assumptions, though they might not be accurate yet.

llvm-svn: 266069
2016-04-12 13:28:39 +00:00
Johannes Doerfert 3bf6e4129f Record assumptions first and add them later
There are three reasons why we want to record assumptions first before we
add them to the assumed/invalid context:

  1) If the SCoP is not profitable or otherwise invalid without the
     assumed/invalid context we do not have to compute it.
  2) Information about the context are gathered rather late in the SCoP
     construction (basically after we know all parameters), thus the user
     might see overly complicated assumptions to be taken while they would
     have been simplified later on.
  3) Currently we cannot take assumptions at any point but have to wait,
     e.g., for the domain generation to finish. This makes wrapping
     assumptions much more complicated as they need to be and it will
     have a similar effect on "signed-unsigned" assumptions later.

llvm-svn: 266068
2016-04-12 13:27:35 +00:00
Johannes Doerfert 97f0dcdea8 Introduce and use MemoryAccess::getPwAff() [NFC]
llvm-svn: 266066
2016-04-12 13:26:45 +00:00
Johannes Doerfert 127abd77a3 Do not assume switch modeling optimizes a SCoP
llvm-svn: 266065
2016-04-12 13:25:43 +00:00
Johannes Doerfert 7c01357cef Introduce an invalid context for each statement
Collect the error domain contexts (formerly in the ErrorDomainCtxMap)
  for each statement in the new InvalidContext member variable. While
  this commit is basically a [NFC] it is a first step to make hoisting
  sound by allowing a more fine grained record of invalid contexts,
  e.g., here on statement level.

llvm-svn: 266053
2016-04-12 09:57:34 +00:00
Johannes Doerfert 65f86cd8b0 Simplify SCEVAffinator code [NFC]
llvm-svn: 266051
2016-04-12 09:33:47 +00:00
Michael Kruse 3b425ff232 Allow overflow of indices with constant dim-sizes.
Allow overflow of indices into the next higher dimension if it has
constant size. E.g.

    float A[32][2];
    ((float*)A)[5];

is effectively the same as

    A[2][1];

This can happen since r265379 as a side effect if ScopDetection
recognizes an access as affine, but ScopInfo rejects the GetElementPtr.

Differential Revision: http://reviews.llvm.org/D18878

llvm-svn: 265942
2016-04-11 14:34:08 +00:00
Michael Kruse 7071e8b355 Do not bind a non-const reference to a rvalue. NFC.
MSVC warns with:
warning C4239: nonstandard extension used: 'initializing': conversion from 'llvm::DebugLoc' to 'llvm::DebugLoc &'
note: A non-const reference may only be bound to an lvalue

Change the reference to a const reference.

llvm-svn: 265937
2016-04-11 13:24:29 +00:00
Johannes Doerfert 561d36b320 Allow pointer expressions in SCEVs again.
In r247147 we disabled pointer expressions because the IslExprBuilder did not
  fully support them. This patch reintroduces them by simply treating them as
  integers. The only special handling for pointers that is left detects the
  comparison of two address_of operands and uses an unsigned compare.

llvm-svn: 265894
2016-04-10 09:50:10 +00:00
Johannes Doerfert fbb63b8028 [FIX] Do not allow select as a base pointer in the SCoP region
llvm-svn: 265884
2016-04-09 21:57:13 +00:00
Johannes Doerfert 81c41b99a7 Do not allow exception handling code in SCoPs
llvm-svn: 265883
2016-04-09 21:55:58 +00:00
Johannes Doerfert 3c6a99b818 Add __isl_give annotations to return types [NFC]
llvm-svn: 265882
2016-04-09 21:55:23 +00:00
Johannes Doerfert b3410db2b7 [FIX] Do not recompute SCEVs but pass them to subfunctions
This reverts commit 2879c53e80e05497f408f21ce470d122e9f90f94.
  Additionally, it adds SDiv and SRem instructions to the set of values
  discovered by the findValues function even if we add the operands to
  be able to recompute the SCEVs. In subfunctions we do not want to
  recompute SDiv and SRem instructions but pass them instead as they
  might have been created through the IslExprBuilder and are more
  complicated than simple SDiv/SRem instructions in the code.

llvm-svn: 265873
2016-04-09 14:30:11 +00:00
Johannes Doerfert 41725a1e7a [FIX] Do not crash on opaque (unsized) types.
llvm-svn: 265834
2016-04-08 19:20:03 +00:00
Johannes Doerfert 5155edc658 [FIX] Teach the ScopExpander about parallel subfunctions
llvm-svn: 265824
2016-04-08 18:16:58 +00:00
Johannes Doerfert a9dc529442 Collect and verify generated parallel subfunctions
We verify the optimized function now for a long time and it helped to track
  down bugs early. This will now also happen for all parallel subfunctions we
  generate.

llvm-svn: 265823
2016-04-08 18:16:02 +00:00
Michael Kruse 436c90619c [ScopInfo] Fix check for element size mismatch.
The way to get the elements size with getPrimitiveSizeInBits() is not
the same as used in other parts of Polly which should use
DataLayout::getTypeAllocSize(). Its use only queries the size of the
pointer and getPrimitiveSizeInBits returns 0 for types that require a
DataLayout object such as pointers.

Together with r265379, this should fix PR27195.

llvm-svn: 265795
2016-04-08 16:20:08 +00:00
Michael Kruse 1fdc2fff1a [ScopInfo] Rename variable to AccType. NFC.
This avoids a name clash with the type llvm::Type.

llvm-svn: 265788
2016-04-08 14:35:59 +00:00
Johannes Doerfert 41cda15940 [FIX] Allow to lookup domains for non-affine subregion blocks
llvm-svn: 265779
2016-04-08 10:32:26 +00:00
Johannes Doerfert 3ef78d6d38 [FIX] Adjust execution context of hoisted loads wrt. error domains
If we build the domains for error blocks and later remove them we lose
  the information that they are not executed. Thus, in the SCoP it looks
  like the control will always reach the statement S:

            for (i = 0 ... N)
                if (*valid == 0)
                  doSth(&ptr);
          S:    A[i] = *ptr;

  Consequently, we would have assumed "ptr" to be always accessed and
  preloaded it unconditionally. However, only if "*valid != 0" we would
  execute the optimized version of the SCoP. Nevertheless, we would have
  hoisted and accessed "ptr"regardless of "*valid". This changes the
  semantic of the program as the value of "*valid" can cause a change of
  "ptr" and control if it is executed or not.

  To fix this problem we adjust the execution context of hoisted loads
  wrt. error domains. To this end we introduce an ErrorDomainCtxMap that
  maps each basic block to the error context under which it might be
  executed. Thus, to the context under which it is executed but an error
  block would have been executed to. To fill this map one traversal of
  the blocks in the SCoP suffices. During this traversal we do also
  "remove" error statements and those that are only reachable via error
  statements. This was previously done by the removeErrorBlockDomains
  function which is therefor not needed anymore.

  This fixes bug PR26683 and thereby several SPEC miscompiles.

Differential Revision: http://reviews.llvm.org/D18822

llvm-svn: 265778
2016-04-08 10:30:09 +00:00
Johannes Doerfert b47cbe1c72 [FIX] Handle multiplications in the SCEVAffinator again
If ScalarEvolution cannot look through some expression but we do, it
  might happen that a multiplication will arrive at the
  SCEVAffinator::visitMulExpr. While we could always try to improve the
  extractConstantFactor function we might still miss something, thus we
  reintroduce the code to generate multiplicative piecewise-affine
  functions as a fall-back.

llvm-svn: 265777
2016-04-08 10:27:40 +00:00
Johannes Doerfert 7b81103589 [FIX] Look through div & srem instructions in SCEVs
The findValues() function did not look through div & srem instructions
  that were part of the argument SCEV. However, in different other
  places we already look through it. This mismatch caused us to preload
  values in the wrong order.

llvm-svn: 265775
2016-04-08 10:25:58 +00:00
Johannes Doerfert a49c557f70 Remove dead code and comment [NFC]
llvm-svn: 265413
2016-04-05 16:18:53 +00:00
Johannes Doerfert 57c5f0b1c4 [FIX] Ensure SAI objects for exit PHIs
If all exiting blocks of a SCoP are error blocks and therefor not
  represented we will not generate accesses and consequently no SAI
  objects for exit PHIs. However, they are needed in the code generation
  to generate the merge PHIs between the original and optimized region.
  With this patch we enusre that the SAI objects for exit PHIs exist
  even if all exiting blocks turn out to be eror blocks.

  This fixes the crash reported in PR27207.

llvm-svn: 265393
2016-04-05 13:44:21 +00:00
Tobias Grosser 535afd808d ScopInfo: Check for possibly nested GEP in fixed-size delin
We currently only consider the first GEP when delinearizing access functions,
which makes us loose information about additional index expression offsets,
which results in our SCoP model to be incorrect. With this patch we now
compare the base pointers used to ensure we do not miss any additional offsets.
This fixes llvm.org/PR27195.

We may consider supporting nested GEP in our delinearization heuristics in
the future.

llvm-svn: 265379
2016-04-05 06:23:45 +00:00
Johannes Doerfert 1519491eaf Do not allow to complex branch conditions
Even before we build the domain the branch condition can become very
  complex, especially if we have to build the complement of a lot of
  equality constraints. With this patch we bail if the branch condition
  has a lot of basic sets and parameters.

  After this patch we now successfully compile
    External/SPEC/CINT2000/186_crafty/186_crafty
  with "-polly-process-unprofitable -polly-position=before-vectorizer".

llvm-svn: 265286
2016-04-04 07:59:41 +00:00
Johannes Doerfert 642594ae87 Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed during
  domain generation. When we push domain information we can utilize the
  information from a block A to build the domain of a block B, if A dominates B
  and there is no loop backede on a path from A to B. When we pull domain
  information we can use information from a block A to build the domain of a
  block B if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL basic block
  in test/ScopInfo/complex-successor-structure-3.ll we used to build a universe
  set with 81 basic sets. Now it actually is represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit if applicable.

  With this patch we now successfully compile
    External/SPEC/CINT2006/400_perlbench/400_perlbench
  and
    SingleSource/Benchmarks/Adobe-C++/loop_unroll.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 265285
2016-04-04 07:57:39 +00:00
Johannes Doerfert a07f0ac73f Factor out "adjustDomainDimensions" function [NFC]
llvm-svn: 265284
2016-04-04 07:50:40 +00:00
Johannes Doerfert d5edbd61a1 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265280
2016-04-03 23:09:06 +00:00
Tobias Grosser 151ae32dba Revert "[FIX] Do not create a SCoP in the presence of infinite loops"
This reverts commit r265260, as it caused the following 'make check-polly'
failures:

    Polly :: ScopDetect/index_from_unpredictable_loop.ll
    Polly :: ScopInfo/multiple_exiting_blocks.ll
    Polly :: ScopInfo/multiple_exiting_blocks_two_loop.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk-2.ll
    Polly :: ScopInfo/schedule-const-post-dominator-walk.ll
    Polly :: ScopInfo/switch-5.ll

llvm-svn: 265272
2016-04-03 19:36:52 +00:00
Johannes Doerfert 2075b5d2a1 [FIX] Do not create two SAI objects for exit PHIs
If an exit PHI is written and also read in the SCoP we should not create two
  SAI objects but only one. As the read is only modeled to ensure OpenMP code
  generation knows about it we can simply use the EXIT_PHI MemoryKind for both
  accesses.

llvm-svn: 265261
2016-04-03 11:16:00 +00:00
Johannes Doerfert 7dcceb82e9 [FIX] Do not create a SCoP in the presence of infinite loops
If a loop has no exiting blocks the region covering we use during
  schedule genertion might not cover that loop properly. For now we bail
  out as we would not optimize these loops anyway.

llvm-svn: 265260
2016-04-03 11:12:39 +00:00
Johannes Doerfert 6ba927148d [FIX] Adjust the insert point for non-affine region PHIs
If a non-affine region PHI is generated we should not move the insert
  point prior to the synthezised value in the same block as we might
  split that block at the insert point later on. Only if the incoming
  value should be placed in a different block we should change the
  insertion point.

llvm-svn: 265132
2016-04-01 11:25:47 +00:00
Tobias Grosser db6db505c9 ScoPDetection: Obtain a known free diagnostic ID
... instead of hardcoding something that has been free at some point. This fixes
a crash triggered by r265084, where the diagnostic IDs have been shifted in a
way that resulted our hardcode ID to not be assigned any implementation.  Our ID
was likely already wrong earlier on, but this time we really crashed nicely.

llvm-svn: 265114
2016-04-01 07:15:19 +00:00
Tobias Grosser 6deba4ea03 Revert 264782 and 264789
These caused LNT failures due to new assertions when running with
-polly-position=before-vectorizer -polly-process-unprofitable for:

FAIL: clamscan.compile_time
FAIL: cjpeg.compile_time
FAIL: consumer-jpeg.compile_time
FAIL: shapes.compile_time
FAIL: clamscan.execution_time
FAIL: cjpeg.execution_time
FAIL: consumer-jpeg.execution_time
FAIL: shapes.execution_time

The failures have been introduced by r264782, but r264789 had to be reverted
as it depended on the earlier patch.

llvm-svn: 264885
2016-03-30 18:18:31 +00:00
Johannes Doerfert a144fb148b Exploit graph properties during domain generation
As a CFG is often structured we can simplify the steps performed
  during domain generation. When we push domain information we can
  utilize the information from a block A to build the domain of a
  block B, if A dominates B. When we pull domain information we can
  use information from a block A to build the domain of a block B
  if B post-dominates A. This patch implements both ideas and thereby
  simplifies domains that were not simplified by isl. For the FINAL
  basic block in
    test/ScopInfo/complex-successor-structure-3.ll .
  we used to build a universe set with 81 basic sets. Now it actually is
  represented as universe set.

  While the initial idea to utilize the graph structure depended on the
  dominator and post-dominator tree we can use the available region
  information as a coarse grained replacement. To this end we push the
  region entry domain to the region exit and pull it from the region
  entry for the region exit.

Differential Revision: http://reviews.llvm.org/D18450

llvm-svn: 264789
2016-03-29 21:31:05 +00:00
Johannes Doerfert e11e08bd1f Factor out "adjustDomainDimensions" function [NFC]
llvm-svn: 264782
2016-03-29 20:41:24 +00:00
Johannes Doerfert 29cb067000 Factor out "getFirstNonBoxedLoopFor" function [NFC]
llvm-svn: 264781
2016-03-29 20:32:43 +00:00
Johannes Doerfert 5fb9b21c24 Bail as early as possible
Instead of waiting for the domain construction to finish we will now
  bail as early as possible in case a complexity problem is encountered.
  This might save compile time but more importantly it makes the "abort"
  explicit. While we can always check if we invalidated the assumed
  context we can simply propagate the result of the construction back.
  This also removes the HasComplexCFG flag that was used for the very
  same reason.

Differential Revision: http://reviews.llvm.org/D18504

llvm-svn: 264775
2016-03-29 20:02:05 +00:00