Commit Graph

2221 Commits

Author SHA1 Message Date
Tobias Grosser 1a2e0e6415 Fix formatting in Polly
llvm-svn: 302620
2017-05-10 04:53:59 +00:00
Chandler Carruth d742e5efa8 Update Polly for LLVM API change r302571 that removed varargs functions
with a nullptr sentinel in favor of nicely typed variadic templates.

llvm-svn: 302618
2017-05-10 02:39:35 +00:00
Siddharth Bhat a90be207c6 [Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen
Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases).

Reviewers: grosser, Meinersbur, bollu

Reviewed By: bollu

Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32961

llvm-svn: 302515
2017-05-09 10:45:52 +00:00
Siddharth Bhat 17f01968f1 [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302379
2017-05-07 21:03:46 +00:00
Michael Kruse 5ae08c0ebb [DeLICM] Known knowledge.
Extend the Knowledge class to store information about the contents
of array elements and which values are written. Two knowledges do
not conflict the known content is the same. The content information
if computed from writes to and loads from the array elements, and
represented by "ValInst": isl spaces that compare equal if the value
represented is the same.

Differential Revision: https://reviews.llvm.org/D31247

llvm-svn: 302339
2017-05-06 14:03:58 +00:00
Michael Kruse 2a8f6f843f [CMake] Introduce POLLY_BUNDLED_JSONCPP.
Allow using a system's install jsoncpp library instead of the bundled
one with the setting POLLY_BUNDLED_JSONCPP=OFF.

This fixes llvm.org/PR32929

Differential Revision: https://reviews.llvm.org/D32922

llvm-svn: 302336
2017-05-06 13:42:15 +00:00
Michael Kruse 391a2ac09b [ScopBuilder] Move Scop::init to ScopBuilder. NFC.
Scop::init is used only during SCoP construction. Therefore ScopBuilder
seems the more appropriate place for it. We integrate it onto its only
caller ScopBuilder::buildScop where some other construction steps
already took place.

Differential Revision: https://reviews.llvm.org/D32908

llvm-svn: 302276
2017-05-05 20:09:08 +00:00
Michael Kruse f1052ceb5e [ScopBuilder] Do not verify unfeasible SCoPs.
SCoPs with unfeasible runtime context are thrown away and therefore
do not need their uses verified.

The added test case requires a complexity limit to exceed.
Normally, error statements are removed from the SCoP and for that
reason are skipped during the verification. If there is a unfeasible
runtime context (here: because of the complexity limit being reached),
the removal of error statements and other SCoP construction steps are
skipped to not waste time. Error statements are not modeled in SCoPs
and therefore have no requirements on whether the scalars used in
them are available.

llvm-svn: 302234
2017-05-05 13:38:35 +00:00
Tobias Grosser d5727c5011 Fix handling of signWrappedSets in access relations
Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip
manually bounding the access relation in case the parameter of the load
instruction is already a wrapped set. Later on we assume that the lower
bound on the set is always smaller or equal to the upper bound on the
set. Bug 32715 manages to construct a sign wrapped set, in which case
the assertion does not necessarily hold. Fix this by handling a sign
wrapped set similar to a normal wrapped set, that is skipping the
computation.

Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch>

Reviewers: grosser

Subscribers: pollydev, llvm-commits

Tags: #Polly

Differential Revision: https://reviews.llvm.org/D32893

llvm-svn: 302231
2017-05-05 13:20:47 +00:00
Siddharth Bhat c1267b9baa Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen"
This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933.

Patches should have been submitted in the order of:

1. D32852
2. D32854
3. D32431

I mistakenly pushed D32431(3) first. Reverting to push in the correct
order.

llvm-svn: 302217
2017-05-05 09:02:08 +00:00
Siddharth Bhat 51904ae35a [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302215
2017-05-05 07:54:49 +00:00
Michael Kruse 704c03e03b [ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH.
It was forgotten in r302157.

llvm-svn: 302163
2017-05-04 15:55:54 +00:00
Michael Kruse eedae7630a Introduce VirtualUse. NFC.
If a ScopStmt references a (scalar) value, there are multiple
possibilities where this value can come. The decision about what kind of
use it is must be handled consistently at different places, which can be
error-prone. VirtualUse is meant to centralize the handling of the
different types of value uses.

This patch makes ScopBuilder and CodeGeneration use VirtualUse. This
already helps to show inconsistencies with the value handling. In order
to keep this patch NFC, exceptions to the general rules are added.
These might be fixed later if they turn to problems. Overall, this
should result in fewer post-codegen IR-verification errors, but instead
assertion failures in `getNewValue` that are closer to the actual error.

Differential Revision: https://reviews.llvm.org/D32667

llvm-svn: 302157
2017-05-04 15:22:57 +00:00
Tobias Grosser 3f25a7e8ee [ScopDetection] Check for already known required-invariant loads [NFC]
For certain test cases we spent over 50% of the scop detection time in
checking if a load is likely invariant. We can avoid most of these checks by
testing early on if a load is expected to be invariant. Doing this reduces
scop-detection time on a large benchmark from 52 seconds to just 25 seconds.

No functional change is expected.

llvm-svn: 302134
2017-05-04 10:16:20 +00:00
Tobias Grosser e2ccc3fb33 [ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters
LLVM-IR names are commonly available in debug builds, but often not in release
builds. Hence, using LLVM-IR names to identify statements or memory reference
results makes the behavior of Polly depend on the compile mode. This is
undesirable. Hence, we now just number the statements instead of using LLVM-IR
names to identify them (this issue has previously been brought up by Zino
Benaissa).

However, as LLVM-IR names help in making test cases more readable, we add an
option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by
default set in the polly tests to make test cases more readable.

This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the
following test case provided by Eli Friedman <efriedma@codeaurora.org> (already
used in one of the previous commits):

  struct X { int x; };
  void a();
  #define SIG (int x, X **y, X **z)
  typedef void (*fn)SIG;
  #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
  #define FN5 FN FN FN FN FN
  #define FN25 FN5 FN5 FN5 FN5
  #define FN125 FN25 FN25 FN25 FN25 FN25
  #define FN250 FN125 FN125
  #define FN1250 FN250 FN250 FN250 FN250 FN250
  void x SIG { FN1250 }

For a larger benchmark I have on-hand (10000 loops), this reduces the time for
running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%.

The reason for this large speedup is that our previous use of printAsOperand
had a quadratic cost, as for each printed and unnamed operand the full function
was scanned to find the instruction number that identifies the operand.

We do not need to adjust the way memory reference ids are constructured, as
they do not use LLVM values.

Reviewed by: efriedma

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32789

llvm-svn: 302072
2017-05-03 20:08:52 +00:00
Tobias Grosser 72684bbaf5 [ScopInfo] Remove code not needed anymore after r302004
llvm-svn: 302005
2017-05-03 08:02:32 +00:00
Tobias Grosser 8133128c17 [ScopInfo] Do not add array name into memory reference ids
Before this change a memory reference identifier had the form:

  <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11

After this change, we use the format:

  <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0

The name of the array that is accessed through a memory reference is not
necessary to uniquely identify a memory reference, but was only added to
provide additional information for debugging. We drop this information now
for the following two reasons:

  1) This shortens the names and consequently improves readability
  2) This removes a second location where we decide on the name of a scop array,
     leaving us only with the location where the actual scop array is created.

Having after 2) only a single location to name scop arrays will allow us to
change the naming convention of scop arrays more easily, which we will do
in a future commit to reduce compilation time.

llvm-svn: 302004
2017-05-03 07:57:35 +00:00
Siddharth Bhat 6c3d19ba45 [NFC] [IslAST] fix typo: "int the" -> "in the"
llvm-svn: 301925
2017-05-02 14:54:49 +00:00
Michael Kruse ecbd57e98a [CMake] Move PollyCore to Polly project folder.
This keeps the artifacts consistently structured in the "Polly"
folder of Visual Studio solutions.

llvm-svn: 301779
2017-04-30 21:07:05 +00:00
Hongbin Zheng e9a9932712 [Polly] Make PollyCore depends on intrinsics_gen
llvm-svn: 301734
2017-04-29 03:12:17 +00:00
Tobias Grosser f13722177b [Codegen] Disable Polly's codegen verification by default
As has been reported in the previous commit, codegen verification can result in
quadratic compile time increases for large functions with many scops. This is
certainly not something we would like to have in the Polly default
configuration. Hence, we disable codegen verification by default -- also to see
if this resolves some of the compilation timeouts we currently see on the AOSP
buildbots. We still leave this feature in Polly as it has shown _very_ useful
for debugging. In fact, we may want to have a discussion if we can bring this
feature back in a way that does not impact compilation time so much.

Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and
for providing the test case in the previous commit (where I forgot to
acknowledge him).

llvm-svn: 301670
2017-04-28 19:15:28 +00:00
Tobias Grosser d439911f73 [CodeGen] Skip verify if -polly-codegen-verify is set to false
Before this change, we always tried to verify the function and printed
verification errors, but just did not abort in case -polly-codegen-verify=false
was set and verification failed. As verification can become very cosly -- for
large functions with many scops we may verify the very same function very often
-- this can affect compile time very negatively. Hence, we respect the
-polly-codegen-verify flag with this check, ensuring that no verification is run
if -polly-codegen-verify=false.

This reduces code generation time from 26 seconds to 4 seconds on the test
case below with -polly-codegen-verify=false:

  struct X { int x; };
  void a();
  #define SIG (int x, X **y, X **z)
  typedef void (*fn)SIG;
  #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
  #define FN5 FN FN FN FN FN
  #define FN25 FN5 FN5 FN5 FN5
  #define FN125 FN25 FN25 FN25 FN25 FN25
  #define FN250 FN125 FN125
  #define FN1250 FN250 FN250 FN250 FN250 FN250
  void x SIG { FN1250 }

llvm-svn: 301669
2017-04-28 19:08:20 +00:00
Siddharth Bhat abed49699b [Polly] [PPCGCodeGeneration] Add managed memory support to GPU code
generation.

This needs changes to GPURuntime to expose synchronization between host
and device.

1. Needs better function naming, I want a better name than
"getOrCreateManagedDeviceArray"

2. DeviceAllocations is used by both the managed memory and the
non-managed memory path. This exploits the fact that the two code paths
are never run together. I'm not sure if this is the best design decision

Reviewed by: PhilippSchaad

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32215

llvm-svn: 301640
2017-04-28 11:16:30 +00:00
Tobias Grosser 287942ae82 Update to isl-0.18-592-gb50ad59
This is just a general maintenance update.

llvm-svn: 301624
2017-04-28 06:11:17 +00:00
Tobias Grosser c96c1d8c87 [ScopInfo] Consider only write-free dereferencable loads as invariant
When we introduced in r297375 support for hoisting loads that are known
to be dereferencable without any conditional guard, we forgot to keep the check
to verify that no other write into the very same location exists. This
change ensures now that dereferencable loads are allowed to access everything,
but can only be hoisted in case no conflicting write exists.

This resolves llvm.org/PR32778

Reported-by: Huihui Zhang <huihuiz@codeaurora.org>
llvm-svn: 301582
2017-04-27 20:08:16 +00:00
Michael Kruse 792a6fcc57 [CMake] Use object library to build the two flavours of Polly.
Polly comes in two library flavors: One loadable module to use the
LLVM framework -load mechanism, and another one that host applications
can link to. These have very different requirements for Polly's
own dependencies.

The loadable module assumes that all its LLVM dependencies are already
available in the address space of the host application, and is not allowed
to bring in its own copy of any LLVM library (including the NVPTX
backend in case of Polly-ACC).

The non-module library is intended to be linked to using
target_link_libraries. CMake would then resolve all of its dependencies,
including NVPTX and ensure that only a single instance of each library
will be used.

Differential Revision: https://reviews.llvm.org/D32442

llvm-svn: 301558
2017-04-27 16:13:03 +00:00
Hongbin Zheng 0f8f177682 [Polly] Do not introduce address space cast
Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally.

Differential Revision: https://reviews.llvm.org/D32581

llvm-svn: 301519
2017-04-27 06:42:14 +00:00
Michael Kruse 3e519b949b [DeLICM] Use Known information when comparing Occupied and Written.
Do not conflict if a write writes the same value as already known.

This change only affects unit tests, but no functional changes are
expected on LLVM-IR, as no Known information is yet extracted and
consequently this functionality is only triggered through unit tests.

Differential Revision: https://reviews.llvm.org/D32026

llvm-svn: 301460
2017-04-26 20:35:07 +00:00
Tobias Grosser 1c3eebac08 Update to isl-0.18-423-g30331fe
This is just a general maintenance update.

llvm-svn: 301433
2017-04-26 17:08:02 +00:00
Michael Kruse cd2be66bf0 [DeLICM] Use Known information when comparing Existing.Occupied and Proposed.Occupied.
Do not conflict if the value of Existing and Proposed are the same.

This change only affects unit tests, but no functional changes are
expected on LLVM-IR, as no Known information is yet extracted and
consequently this functionality is only triggered through unit tests.

Differential Revision: https://reviews.llvm.org/D32025

llvm-svn: 301301
2017-04-25 10:57:32 +00:00
Siddharth Bhat d277feda91 [PPCGCodeGeneration] Update PPCG Code Generation for OpenCL compatibility
Added a small change to the way pointer arguments are set in the kernel
code generation. The way the pointer is retrieved now, specifically requests
global address space to be annotated. This is necessary, if the IR should be
run through NVPTX to generate OpenCL compatible PTX.

The changes do not affect the PTX Strings generated for the CUDA target
(nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl).

Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends.

Contributed-by: Philipp Schaad

Reviewers: Meinersbur, grosser, bollu

Reviewed By: grosser, bollu

Subscribers: jlebar, pollydev, llvm-commits, nemanjai, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32215

llvm-svn: 301299
2017-04-25 08:08:29 +00:00
Siddharth Bhat 729377f063 [Polly] [DependenceInfo] change WAR generation, Read will not block Read
Earlier, the call to buildFlow was:
    WAR = buildFlow(Write, Read, MustWrite, Schedule).

This meant that Read could block another Read, since must-sources can
block each other.

Fixed the call to buildFlow to correctly compute Read. The resulting
code needs to do some ISL juggling to get the output we want.

Bug report: https://bugs.llvm.org/show_bug.cgi?id=32623

Reviewers: Meinersbur

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32011

llvm-svn: 301266
2017-04-24 22:23:12 +00:00
Tobias Grosser 9b34a08b19 [isl C++ bindings] Add explicit const casts for *foreach* bindings
This avoids a compiler warning about lost 'const' attributes.

Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 301108
2017-04-23 07:54:12 +00:00
Michael Kruse 8431e996d3 [DeLICM] Use Known information when comparing Existing.Written and Proposed.Written.
This change only affects unit tests, but no functional changes are
expected on LLVM-IR, as no Known information is yet extracted and
consequently this functionality is only triggered through unit tests.

Differential Revision: https://reviews.llvm.org/D32027

llvm-svn: 300874
2017-04-20 19:16:39 +00:00
Tobias Grosser 1f8b84094f Update isl bindings to latest version (+ Polly extensions)
After the isl C++ binding generator is now close to being upstreamed to isl, we
synchronize the latest changes to Polly. These are mostly formatting changes
plus a small interface change for the foreach callback function and some naming
changes in isl::boolean.

llvm-svn: 300398
2017-04-15 08:15:54 +00:00
Tobias Grosser 75aa1a9a49 Use isl C++ foreach implementation
This commit switches Polly over to the isl::obj::foreach_* implementation, which
is part of the new isl bindings and follows the foreach pattern established in
Polly by Michael Kruse.

The original isl C function:

  isl_stat isl_union_set_foreach_set(__isl_keep isl_union_set *uset,
      isl_stat (*fn)(__isl_take isl_set *set, void *user), void *user);

which required the user to define a static callback function to which all
interesting parameters are passed via a 'void *' user-pointer, is on the
C++ side available as a function that takes a std::function<>, which can
carry any additional arguments without the need for a user pointer:

  stat UnionSet::foreach_set(const std::function<stat(set)> &fn) const;

The following code illustrates the use of the new C++ interface:

  auto Lambda = [=, &Result](isl::set Set) -> isl::stat {
    auto Shifted = shiftDimension(Set, Pos, Amount);
    Result = Result.add(Shifted);
    return isl::stat::ok;
  }

  UnionSet.foreach_set(Lambda);

Polly had some specialized foreach functions which did not require the lambdas
to return a status flag. We remove these functions in this commit to move Polly
completely over to the new isl interface. We may in the future discuss if
functors without return values can be supported easily.

Another extension proposed by Michael Kruse is the use of C++ iterators to allow
the use of normal for loops to iterate over these sets. Such an extension would
allow us to further simplify the code.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D30620

llvm-svn: 300323
2017-04-14 13:39:40 +00:00
Michael Kruse 72f3922534 [DeLICM] Export Known and Written to DeLICMTests. NFC.
This will allow unittesting of new functionality based on
Known and Written.

llvm-svn: 300211
2017-04-13 16:32:39 +00:00
Michael Kruse a2acc11949 [DeLICM] Add Knowledge::Known. NFC.
This field will later contain a ValInst that is known to be stored
in an occupied array element.

llvm-svn: 300210
2017-04-13 16:32:31 +00:00
Michael Kruse fa7c8cdfc6 [DeLICM] Make Knowledge::Written an isl::union_map. NFC.
The map will later point to a ValInst that is written.

llvm-svn: 300208
2017-04-13 16:32:25 +00:00
Tobias Grosser 7b5a4dfd46 Exploit BasicBlock::getModule to shorten code
Suggested-by: Roman Gareev <gareevroman@gmail.com>
llvm-svn: 299914
2017-04-11 04:59:13 +00:00
Tobias Grosser 67726b3260 SAdjust to recent change in constructor definition of AllocaInst
llvm-svn: 299913
2017-04-11 04:23:38 +00:00
Matt Arsenault b3e30c32ce Update for alloca construction changes
llvm-svn: 299905
2017-04-11 00:12:58 +00:00
Roman Gareev 9d4d91ca6a [FIX] Fix ScheduleTreeOptimizer::optimizeMatMulPattern
Use new values of the dimensions during their permutation.

llvm-svn: 299663
2017-04-06 17:25:08 +00:00
Roman Gareev e0d466342b Restore the initial ordering of dimensions before applying the pattern matching
Dimensions of band nodes can be implicitly permuted by the algorithm applied
during the schedule generation.

For example, in case of the following matrix-matrix multiplication,

for (i = 0; i < 1024; i++)
  for (k = 0; k < 1024; k++)
    for (j = 0; j < 1024; j++)
      C[i][j] += A[i][k] * B[k][j];

it can produce the following schedule tree

domain: "{ Stmt_for_body6[i0, i1, i2] : 0 <= i0 <= 1023 and 0 <= i1 <= 1023 and
                                        0 <= i2 <= 1023 }"
child:
  schedule: "[{ Stmt_for_body6[i0, i1, i2] -> [(i0)] },
              { Stmt_for_body6[i0, i1, i2] -> [(i1)] },
              { Stmt_for_body6[i0, i1, i2] -> [(i2)] }]"
  permutable: 1
  coincident: [ 1, 1, 0 ]

The current implementation of the pattern matching optimizations relies on the
initial ordering of dimensions. Otherwise, it can produce the miscompilation
(e.g., [1]).

This patch helps to restore the initial ordering of dimensions by recreating
the band node when the corresponding conditions are satisfied.

Refs.:

[1] - https://bugs.llvm.org/show_bug.cgi?id=32500

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D31741

llvm-svn: 299662
2017-04-06 17:09:54 +00:00
Siddharth Bhat 5eeb1dd42e [Polly] [ScheduleOptimizer] Prevent incorrect tile size computation
Because Polly exposes parameters that directly influence tile size
calculations, one can setup situations like divide-by-zero.

Check against a possible divide-by-zero in getMacroKernelParams
and return early.

Also assert at the end of getMacroKernelParams that the block sizes
computed for matrices are positive (>= 1).

Tags: #polly

Differential Revision: https://reviews.llvm.org/D31708

llvm-svn: 299633
2017-04-06 08:20:22 +00:00
Tobias Grosser 0d622a4bf9 Update to isl-0.18-417-gb9e7334
This is a regular maintenance update.

llvm-svn: 299617
2017-04-06 03:41:47 +00:00
Michael Kruse 895f5d8080 Remove llvm.lifetime.start/end in original region.
The current StackColoring algorithm does not correctly handle the
situation when some, but not all paths from a BB to the entry node
cross a llvm.lifetime.start. According to an interpretation of the
language reference at
http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic
this might be correct, but it would cost too much effort to handle
in StackColoring.

To be on the safe side, remove all lifetime markers even in the original
code version (they have never been copied to the optimized version)
to ensure that no path to the entry block will cross a
llvm.lifetime.start.

The same principle applies to paths the a function return and the
llvm.lifetime.end marker, so we remove them as well.

This fixes llvm.org/PR32251.

Also see the discussion at
http://lists.llvm.org/pipermail/llvm-dev/2017-March/111551.html

llvm-svn: 299585
2017-04-05 20:09:59 +00:00
Siddharth Bhat bcbfdade41 [Polly] [DependenceInfo] change WAR, WAW generation to correct semantics
= Change of WAR, WAW generation: =

- `buildFlow(Sink, MustSource, MaySource, Sink)` treates any flow of the form
    `sink <- may source <- must source` as a *may* dependence.

- we used to call:
```lang=cpp, name=old-flow-call.cpp
Flow = buildFlow(MustWrite, MustWrite, Read, Schedule);
WAW = isl_union_flow_get_must_dependence(Flow);
WAR = isl_union_flow_get_may_dependence(Flow);
```

- This caused some WAW dependences to be treated as WAR dependences.
- Incorrect semantics.

- Now, we call WAR and WAW correctly.

== Correct WAW: ==
```lang=cpp, name=new-waw-call.cpp
   Flow = buildFlow(Write, MustWrite, MayWrite, Schedule);
   WAW = isl_union_flow_get_may_dependence(Flow);
   isl_union_flow_free(Flow);
```

== Correct WAR: ==
```lang=cpp, name=new-war-call.cpp
    Flow = buildFlow(Write, Read, MustaWrite, Schedule);
    WAR = isl_union_flow_get_must_dependence(Flow);
    isl_union_flow_free(Flow);
```

- We want the "shortest" WAR possible (exact dependences).
- We mark all the *must-writes* as may-source, reads as must-souce.
- Then, we ask for *must* dependence.
- This removes all the reads that flow through a *must-write*
  before reaching a sink.
- Note that we only block ealier writes with *must-writes*. This is
  intuitively correct, as we do not want may-writes to block
  must-writes.
- Leaves us with direct (R -> W).

- This affects reduction generation since RED is built using WAW and WAR.

= New StrictWAW for Reductions: =

- We used to call:
```lang=cpp,name=old-waw-war-call.cpp
      Flow = buildFlow(MustWrite, MustWrite, Read, Schedule);
      WAW = isl_union_flow_get_must_dependence(Flow);
      WAR = isl_union_flow_get_may_dependence(Flow);
```

- This *is* the right model of WAW we need for reductions, just not in general.
- Reductions need to track only *strict* WAW, without any interfering reductions.

= Explanation: Why the new WAR dependences in tests are correct: =

- We no longer set WAR = WAR - WAW
- Hence, we will have WAR dependences that were originally removed.
- These may look incorrect, but in fact make sense.

== Code: ==
```lang=llvm, name=new-war-dependence.ll
  ;    void manyreductions(long *A) {
  ;      for (long i = 0; i < 1024; i++)
  ;        for (long j = 0; j < 1024; j++)
  ; S0:          *A += 42;
  ;
  ;      for (long i = 0; i < 1024; i++)
  ;        for (long j = 0; j < 1024; j++)
  ; S1:          *A += 42;
  ;
```
=== WAR dependence: ===
  {  S0[1023, 1023] -> S1[0, 0] }

- Between `S0[1023, 1023]` and `S1[0, 0]`, we will have the dependences:

```lang=cpp, name=dependence-incorrect, counterexample
        S0[1023, 1023]:
    *-- tmp = *A (load0)--*
WAR 2   add = tmp + 42    |
    *-> *A = add (store0) |
                         WAR 1
        S1[0, 0]:         |
        tmp = *A (load1)  |
        add = tmp + 42    |
        A = add (store1)<-*
```

- One may assume that WAR2 *hides* WAR1 (since store0 happens before
  store1). However, within a statement, Polly has no idea about the
  ordering of loads and stores.

- Hence, according to Polly, the code may have looked like this:
```lang=cpp, name=dependence-correct
    S0[1023, 1023]:
    A = add (store0)
    tmp = A (load0) ---*
    add = A + 42       |
                     WAR 1
    S1[0, 0]:          |
    tmp = A (load1)    |
    add = A + 42       |
    A = add (store1) <-*
```

- So, Polly  generates (correct) WAR dependences. It does not make sense
  to remove these dependences, since they are correct with respect to
  Polly's model.

    Reviewers: grosser, Meinersbur

    tags: #polly

    Differential revision: https://reviews.llvm.org/D31386

llvm-svn: 299429
2017-04-04 13:08:23 +00:00
Philip Pfaffe 447f175eb5 Fix formatting in LoopGenerators
llvm-svn: 299424
2017-04-04 10:22:17 +00:00
Philip Pfaffe 2d950f36ee [Polly][NewPM] Pull references to the legacy PM interface from utilities and helpers
Summary:
A couple of the utilities used to analyze or build IR make explicit use of the legacy PM on their interface, to access analysis results. This patch removes the legacy PM from the interface, and just passes the required results directly.

This shouldn't introduce any function changes, although the API technically allowed to obtain two different analysis results before, one passed by reference and one through the PM. I don't believe that was ever intended, however.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Subscribers: nemanjai, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D31653

llvm-svn: 299423
2017-04-04 10:01:53 +00:00