3192 Commits

Author SHA1 Message Date
Siddharth Bhat 68ae83e68c [Docs] Use ReadTheDocs theme if available.
Use ReadTheDocs theme for Sphinx if available since it is well
maintained and used by readthedocs.org.

Differential Revision: https://reviews.llvm.org/D33387

llvm-svn: 303550
2017-05-22 13:36:15 +00:00
Siddharth Bhat b2f754e39f [Docs] Fix Sphinx documentation in CMake check.
Summary:
- `include(AddSphinxTarget)` needs to occur before checking `SPHINX_FOUND`.
- `docs-polly-html` and `docs-polly-man` are now usable again.
- Perhaps we should build docs in the CI as well?

Differential Revision: https://reviews.llvm.org/D33386

llvm-svn: 303549
2017-05-22 13:16:02 +00:00
Michael Kruse 706f79ab14 [CodeGen] Support partial write accesses.
Allow the BlockGenerator to generate memory writes that are not defined
over the complete statement domain, but only over a subset of it. It
generates a condition that evaluates to 1 if executing the subdomain,
and only then executes the access.

Only write accesses are supported. Read accesses would require a PHINode
which has a value if the access is not executed.

Partial write makes DeLICM able to apply mappings that are not defined
over the entire domain (for instance, a branch that leaves a loop with
a PHINode in its header; a MemoryKind::PHI write when leaving is never
read by its PHI read).

Differential Revision: https://reviews.llvm.org/D33255

llvm-svn: 303517
2017-05-21 22:46:57 +00:00
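Note on 706f79ab14: the partial-write idea can be pictured with a small Python sketch. It is not Polly's BlockGenerator code; `run_statement` and all of its parameters are hypothetical names chosen for illustration. The statement runs over its whole domain, but the write happens only where a generated condition says the point belongs to the write's subdomain.

```python
def run_statement(domain, write_subdomain, memory, address, value_of):
    """Run a statement for every point of `domain`, but perform the write
    only for points that also lie in `write_subdomain` (illustrative only)."""
    for point in domain:
        # The generated condition: 1 inside the subdomain, 0 outside.
        in_subdomain = 1 if point in write_subdomain else 0
        if in_subdomain:
            memory[address(point)] = value_of(point)
    return memory
```

For example, with domain `range(4)` and subdomain `{1, 3}`, only the elements for points 1 and 3 are written.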
Tobias Grosser 7be8245a40 [ScopInfo] Translate updateDimensionality to isl C++ [NFC]
llvm-svn: 303514
2017-05-21 20:38:33 +00:00
Tobias Grosser a3f7546931 [isl++] add isl_constraint to C++ bindings [NFC]
llvm-svn: 303512
2017-05-21 20:23:26 +00:00
Tobias Grosser 3137f2cb65 [ScopInfo] Translate wrapConstantDimensions to isl C++ [NFC]
llvm-svn: 303511
2017-05-21 20:23:23 +00:00
Tobias Grosser 99ea1d0808 [ScopInfo] Translate addRangeBoundsToSet to isl C++ [NFC]
llvm-svn: 303510
2017-05-21 20:23:20 +00:00
Tobias Grosser 1f94dcee0b Fix include order to stop clang-format complaints
llvm-svn: 303509
2017-05-21 16:34:09 +00:00
Tobias Grosser 7205f93a98 [ScheduleOptimizer] Move schedule construction to isl C++ [NFC]
llvm-svn: 303508
2017-05-21 16:21:33 +00:00
Tobias Grosser b5f61bdeeb [Simplify] Move to isl C++
llvm-svn: 303507
2017-05-21 16:12:21 +00:00
Tobias Grosser 6151654c00 [isl++] Export (almost) all functions from isl
This commit exports the majority of the isl functions to the isl C++ interface.

The official isl C++ bindings still require discussions to define the set of
functions that are officially supported. As a result, the officially exported
functionality will be rather limited until these discussions conclude and a
non-trivial set of isl functions is officially supported through the isl C++
bindings. Starting from this commit we ship with Polly an extended version of
the official isl C++ bindings to ensure sufficient functionality is available
such that LLVM developers can make efficient use of isl through C++. The
practical experience Polly gathers with its bindings will then be used to
gradually upstream patches to isl to extend the official bindings.

llvm-svn: 303506
2017-05-21 16:00:32 +00:00
Tobias Grosser 443f6814a1 [isl++] Rebase isl C++ bindings on top of 29aee98ce
This reduces the diff to the official isl C++ bindings and solves a correctness
issue with isl::booleans, where isl_bool_error results were accidentally
converted to isl::boolean::true.

llvm-svn: 303505
2017-05-21 15:59:15 +00:00
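Note on 443f6814a1: a three-state boolean is easy to mishandle when it is collapsed to two states. The Python sketch below only illustrates that class of bug; it is not the isl C++ binding code. The numeric values mirror isl's `isl_bool_error`, `isl_bool_false` and `isl_bool_true`; the "non-zero means true" conversion is an assumed way such a bug can arise, and the function names are hypothetical.

```python
from enum import Enum


class IslBool(Enum):
    ERROR = -1
    FALSE = 0
    TRUE = 1


def to_bool_buggy(raw):
    # Any non-zero value counts as true, so ERROR (-1) silently becomes True.
    return raw.value != 0


def to_bool_checked(raw):
    # Handle the error state separately instead of folding it into True.
    if raw is IslBool.ERROR:
        raise RuntimeError("isl operation failed")
    return raw is IslBool.TRUE
```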
Tobias Grosser 3320485961 [isl++] Move isl raw_ostream printers into separate header
Instead of relying on these functions to be part of the isl C++ bindings, we
just define this functionality independently. This allows us to use isl C++
bindings that do not contain LLVM specific functionality.

llvm-svn: 303503
2017-05-21 13:16:05 +00:00
Tobias Grosser ee61ebb134 Fix buildbots after r303429
A test case with a GPU runline was added without setting 'REQUIRES=pollyacc'. We
drop the GPU run line, as the basic functionality can already be tested with
the normal code generation.

llvm-svn: 303485
2017-05-20 04:22:26 +00:00
Siddharth Bhat b7f68b8c9e [Fortran Support] Materialize outermost dimension for Fortran array.
- We use the outermost dimension of arrays since we need this
information to generate GPU transfers.

- In general, if we do not know the outermost dimension of the array
(because the indexing expression is non-affine, for example) then we
simply cannot generate transfer code.

- However, for Fortran arrays, we can use the Fortran array
representation which stores the dimensions of all arrays.

- This patch uses the Fortran array representation to generate code that
computes the outermost dimension size.

Differential Revision: https://reviews.llvm.org/D32967

llvm-svn: 303429
2017-05-19 15:07:45 +00:00
Tobias Grosser d8945baa0a [ScopDetection] Allow detection of full functions
This is useful when only analyzing functions.

llvm-svn: 303420
2017-05-19 12:13:02 +00:00
Tobias Grosser 977158488e [ScopInfo] Fix typo in documentation
llvm-svn: 303405
2017-05-19 04:01:52 +00:00
Tobias Grosser 45e9fd1810 [ScopInfo] Gracefully handle long compile times
The following test case tried to compute the lexicographic minimum of the
following set during alias analysis, which caused very long compile time:

[p_0, p_1, p_2, p_3, p_4, p_5] -> { MemRef0[i0] : (517p_3 >= 70944 - 298p_2 and
256i0 >= -71199 + 298p_2 + 517p_3 and 256i0 <= -70944 + 298p_2 + 517p_3) or
(409p_4 >= 57120 - 298p_2 and 256i0 >= -57375 + 298p_2 + 409p_4 and 256i0 <=
-57120 + 298p_2 + 409p_4) or (104p_4 >= 17329 + 149p_2 - 50p_3 and 128i0 >=
17328 + 149p_2 - 50p_3 - 104p_4 and 128i0 <= 17455 + 149p_2 - 50p_3 - 104p_4) or
(104p_4 <= 17328 + 149p_2 - 50p_3 and 128i0 >= 17201 + 149p_2 - 50p_3 - 104p_4
and 128i0 <= 17328 + 149p_2 - 50p_3 - 104p_4) or (409p_4 <= 57119 - 298p_2 and
256i0 >= -57120 + 298p_2 + 409p_4 and 256i0 <= -56865 + 298p_2 + 409p_4) or
(517p_3 <= 70943 - 298p_2 and 256i0 >= -70944 + 298p_2 + 517p_3 and 256i0 <=
-70689 + 298p_2 + 517p_3) or (p_1 >= 2 + 2p_0 and 298p_5 >= 70944 - 517p_3 and
256i0 >= -71199 + 517p_3 + 298p_5 and 256i0 <= -70944 + 517p_3 + 298p_5) or (p_1
>= 2 + 2p_0 and 298p_5 >= 57120 - 409p_4 and 256i0 >= -57375 + 409p_4 + 298p_5
and 256i0 <= -57120 + 409p_4 + 298p_5) or (p_1 >= 2 + 2p_0 and 149p_5 <= -17329
+ 50p_3 + 104p_4 and 128i0 >= 17328 - 50p_3 - 104p_4 + 149p_5 and 128i0 <=
17455 - 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 149p_5 >= -17328 +
50p_3 + 104p_4 and 128i0 >= 17201 - 50p_3 - 104p_4 + 149p_5 and 128i0 <= 17328
- 50p_3 - 104p_4 + 149p_5) or (p_1 >= 2 + 2p_0 and 298p_5 <= 57119 - 409p_4 and
256i0 >= -57120 + 409p_4 + 298p_5 and 256i0 <= -56865 + 409p_4 + 298p_5) or
(p_1 >= 2 + 2p_0 and 298p_5 <= 70943 - 517p_3 and 256i0 >= -70944 + 517p_3 +
298p_5 and 256i0 <= -70689 + 517p_3 + 298p_5) }

We now guard the potentially expensive functions in Polly's scop analysis to
gracefully bail out in case of overly long compilation times.

llvm-svn: 303404
2017-05-19 03:45:00 +00:00
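Note on 45e9fd1810: one common way to "gracefully bail out" of an expensive computation is to charge every step against an operation budget and give up once it is exhausted. The Python sketch below shows that pattern only; it is not the guard Polly actually uses, and `OperationBudget`, `BudgetExceeded` and `lexmin_with_budget` are hypothetical names.

```python
class BudgetExceeded(Exception):
    """Raised when a computation uses more steps than it was allowed."""


class OperationBudget:
    def __init__(self, max_operations):
        self.remaining = max_operations

    def charge(self, cost=1):
        self.remaining -= cost
        if self.remaining < 0:
            raise BudgetExceeded()


def lexmin_with_budget(points, budget):
    """Return the lexicographic minimum of `points`, or None if the
    budget runs out before the scan finishes."""
    try:
        best = None
        for point in points:
            budget.charge()
            if best is None or point < best:
                best = point
        return best
    except BudgetExceeded:
        return None  # caller treats this as "analysis not available"
```

A caller that receives `None` can then skip the optimization instead of hanging the compile.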
Michael Kruse 960c0d0b04 [ScopInfo] Fix r302231 to use logical or (||). NFC.
In r302231 we mistakenly used bitwise or (|) instead of logical
or (||). This patch fixes that.

Contributed-by: Sameer AbuAsal <sabuasal@codeaurora.org>

Differential Revision: https://reviews.llvm.org/D33337

llvm-svn: 303386
2017-05-18 21:55:36 +00:00
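Note on 960c0d0b04: for boolean operands the two operators give the same result, which is why the fix is marked NFC; what differs is evaluation. The bitwise form always evaluates both operands, while the logical form short-circuits. Python's `|` and `or` behave analogously to C++'s `|` and `||` in this respect, so the sketch below uses them as a stand-in; `check` and the two wrapper functions are hypothetical.

```python
calls = []


def check(name, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return result


def with_bitwise_or():
    calls.clear()
    result = check("a", True) | check("b", False)  # both operands run
    return result, list(calls)


def with_logical_or():
    calls.clear()
    result = check("a", True) or check("b", False)  # stops after "a"
    return result, list(calls)
```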
Reid Kleckner 96ab8726a3 [IR] De-virtualize ~Value to save a vptr
Summary:
Implements PR889

Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:

https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing

This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call to Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal.  However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue.  I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.

I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.

Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv

Reviewed By: chandlerc

Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits

Differential Revision: https://reviews.llvm.org/D31261

llvm-svn: 303362
2017-05-18 17:24:10 +00:00
Siddharth Bhat 06e3c74d83 [Fortran Support] Change "global" pattern match to work for params
Summary:
- Rename global / local naming convention that did not make much sense
to Visible / Invisible, where the visible refers to whether the ALLOCATE
call to the Fortran array is present in the current module or not.

- This match now works on both cross fortran module globals and on
parameters to functions since neither of them are necessarily allocated
at the point of their usage.

- Add testcase that matches against both a load and a store against
function parameters.

Differential Revision: https://reviews.llvm.org/D33190

llvm-svn: 303356
2017-05-18 16:47:13 +00:00
Michael Kruse 1198b1f8d6 [ScopInfo] Remove unused MemoryAccess::BaseName. NFC.
llvm-svn: 303189
2017-05-16 16:52:24 +00:00
Tobias Grosser e890d5ba1b Drop nonexisting ScopPassManager directory
llvm-svn: 303066
2017-05-15 14:12:30 +00:00
Tobias Grosser ff3f38b2c5 Adjust formatting
llvm-svn: 303065
2017-05-15 14:12:27 +00:00
Philip Pfaffe 762ec5a3eb [Polly][NewPM] Add missing Unittests
llvm-svn: 303064
2017-05-15 13:52:10 +00:00
Philip Pfaffe 35bdcaf9e9 [Polly][NewPM][WIP] Add a ScopPassManager
This patch adds both a ScopAnalysisManager and a ScopPassManager.

The ScopAnalysisManager is itself a Function-Analysis, and manages
analyses on Scops. The ScopPassManager takes care of building Scop pass
pipelines.

This patch is marked WIP because I've left two FIXMEs which I need to
think about some more. Both of these deal with invalidation:

Deferred invalidation is currently not implemented. Deferred
invalidation deals with analyses which cache references to other
analysis results. If these results are invalidated, invalidation needs
to be propagated into the caching analyses.
The ScopPassManager as implemented assumes that ScopPasses do not affect
other Scops in any way. There has been some discussion about this on
other patch threads, however it makes sense to reiterate this for this
specific patch.
I'm uploading this patch even though it's incomplete to encourage
discussion and give you an impression of how this is going to work.

Differential Revision: https://reviews.llvm.org/D33192

llvm-svn: 303062
2017-05-15 13:43:01 +00:00
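Note on 35bdcaf9e9: the deferred invalidation described in this commit message amounts to propagating an invalidation to every cached analysis that holds a reference to the invalidated result. The Python sketch below illustrates that propagation with a plain worklist; it does not use or mirror LLVM's pass manager API, and `invalidate`, `depends_on` and `cached` are hypothetical names.

```python
def invalidate(analysis, depends_on, cached):
    """Remove `analysis` from `cached` and, transitively, every cached
    analysis that references it.

    depends_on maps an analysis name to the set of analyses it caches
    references to; cached is the set of currently valid results.
    """
    worklist = [analysis]
    while worklist:
        current = worklist.pop()
        cached.discard(current)
        for dependent, dependencies in depends_on.items():
            if current in dependencies and dependent in cached:
                worklist.append(dependent)
    return cached
```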
Philip Pfaffe bbb86719c1 [Polly][CMake] Exclude isl_config from the polly-check-format target.
Summary:
The custom `polly-check-format` target runs clang-format over all source files in the directory tree excluding lib/External. `isl_config.h` is a header file that is generated by CMake in the build directory, and it's not correctly formatted (which I also wouldn't consider necessary, as it is a generated file).

If the build directory is actually inside the Polly source directory (which it might be if you're building Polly out-of-tree), that check always fails. Hence this patch excludes this file from the check-format target.

Reviewers: Meinersbur, grosser

Reviewed By: grosser

Subscribers: mgorny, llvm-commits, pollydev

Tags: #polly

Differential Revision: https://reviews.llvm.org/D33192

llvm-svn: 303060
2017-05-15 13:20:26 +00:00
Philip Pfaffe 3030bf0c81 [Polly][Fortran Support] Fix two testcases for the loadable-library use-case
llvm-svn: 303057
2017-05-15 12:58:31 +00:00
Philip Pfaffe 838e0884ef [Polly][NewPM] Port ScopInfo to the new PassManager
llvm-svn: 303056
2017-05-15 12:55:14 +00:00
Siddharth Bhat aed4b5682d [NFC] [Fortran Support] Fix findFADGlobalNonAlloc pattern match comment
llvm-svn: 303052
2017-05-15 11:49:19 +00:00
Siddharth Bhat 0fe7231a2f [Fortran Support] Add pattern match for Fortran Arrays that are parameters.
- This breaks the previous assumption that Fortran Arrays are `GlobalValue`.

- The names of functions were getting unwieldy. So, I renamed the
Fortran related functions.

Differential Revision: https://reviews.llvm.org/D33075

llvm-svn: 303040
2017-05-15 08:41:30 +00:00
Siddharth Bhat 9746f817ea [Simplify] Fix r302986 that introduced non-inferrable templates.
- auto + decltype + template use was not inferrable in
  `Transform/Simplify.cpp accessesInOrder`.

- changed code to explicitly construct required vector instead of using
  higher order iterator helpers.

- Failing compiler spec:
    Apple LLVM version 7.3.0 (clang-703.0.31)
    Target: x86_64-apple-darwin15.6.0

llvm-svn: 303039
2017-05-15 08:18:51 +00:00
Tobias Grosser 497fdd7dff [Simplify] Remove some leftover dead code
llvm-svn: 303007
2017-05-14 09:20:56 +00:00
Tobias Grosser b693f42b71 [Polly] Fix code generation of llvm.expect intrinsic
At the time of code generation, an instruction with an llvm intrinsic is ignored
in copyBB. However, if the value of the instruction is used later in the
program, the value needs to be synthesized, which causes issues with the
instructions being generated in a hoisted basic block.

Removing llvm.expect from the list of ignored intrinsics fixes this bug.

This resolves http://llvm.org/PR32324.

Contributed-by: Annanay Agarwal <cs14btech11001@iith.ac.in>

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32992

llvm-svn: 303006
2017-05-14 09:09:54 +00:00
Michael Kruse fa7be88378 [Simplify] Remove identical write removal. NFC.
Removal of overwritten writes currently encompasses all the cases
of the identical write removal.

There is an observable behavioral change in that the last, instead
of the first, MemoryAccess is kept. This should not affect the
generated code, however.

Differential Revision: https://reviews.llvm.org/D33143

llvm-svn: 302987
2017-05-13 12:20:57 +00:00
Michael Kruse f263610b82 [Simplify] Remove writes that are overwritten.
Remove memory writes that are overwritten by later writes. This works
for StoreInsts:

      store double 21.0, double* %A
      store double 42.0, double* %A

scalar writes at the end of a statement and mixes of these.

Multiple writes can be the result of DeLICM, which might map multiple
writes to the same location when it knows that these do not conflict
(for instance because they write the same value). Such writes
interfere with pattern-matched optimization such as gemm and may not
get removed by other LLVM passes after code generation.

Differential Revision: https://reviews.llvm.org/D33142

llvm-svn: 302986
2017-05-13 11:49:34 +00:00
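Note on f263610b82: removing overwritten writes can be sketched as a backward scan that remembers which locations will be written again before anything reads them. The Python below is a simplified model, not the code in Simplify.cpp: accesses are plain tuples, locations are compared by name, and the handling of intervening reads is an assumption added to keep the sketch safe.

```python
def remove_overwritten_writes(accesses):
    """accesses: list of (kind, location, value) in execution order,
    where kind is "read" or "write". Returns the list without writes
    that a later write to the same location hides."""
    pending_overwrite = set()  # locations written again before any read
    kept = []
    for kind, location, value in reversed(accesses):
        if kind == "write":
            if location in pending_overwrite:
                continue  # a later write makes this one unobservable
            pending_overwrite.add(location)
        else:
            # A read makes earlier writes to this location observable.
            pending_overwrite.discard(location)
        kept.append((kind, location, value))
    kept.reverse()
    return kept
```

Applied to the two stores quoted in the commit message, only the store of 42.0 survives.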
Michael Kruse aeb4864090 [Simplify] Reset all stats between runs.
llvm-svn: 302926
2017-05-12 17:23:07 +00:00
Philip Pfaffe 5cc87e3ab3 [Polly][NewPM] Port ScopDetection to the new PassManager
Summary: This is a proof of concept of how to port polly-passes to the new PassManager architecture.  This approach works ootb for Function-Passes, but might not be directly applicable to Scop/Region-Passes. While we could just run the Analyses/Transforms over functions instead, we'd surrender the nice pipelining behaviour we have now.

Reviewers: Meinersbur, grosser

Reviewed By: grosser

Subscribers: pollydev, sanjoy, nemanjai, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D31459

llvm-svn: 302902
2017-05-12 14:37:29 +00:00
Siddharth Bhat d0d29addf9 [NFC] [Fortran Support] Run -instnamer on testcases
llvm-svn: 302892
2017-05-12 12:36:04 +00:00
Siddharth Bhat f16db04cd5 [FIX] Fix regression caused by c29f4ed, testcase matches output
- Commit changed codegen for induction variables
- Updated testcase

llvm-svn: 302891
2017-05-12 11:34:51 +00:00
Philip Pfaffe cda7152fcb [Polly][CMake] Fix variable name in target exports
llvm-svn: 302888
2017-05-12 10:39:38 +00:00
Siddharth Bhat c05fcc0d9e [NFC] [Fortran Support] Cleanup Fortran Array pattern match testcases
- Move the testcases to ScopInfo/ since the processing takes place in
  ScopBuilder.

- Cleanup testcases, run -polly-canonicalize on them, find minimal set
  of opt parameters.

llvm-svn: 302886
2017-05-12 09:37:39 +00:00
Hongbin Zheng 5b263d4ce1 [Polly] Remove unused header
llvm-svn: 302868
2017-05-12 02:21:50 +00:00
Hongbin Zheng 4fe342cb75 [Polly] Generate more 'canonical' induction variable
Today Polly generates induction variables in this way:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stride
polly.loop_cond = predicate polly.indvar, (UB - stride)

Instead of:

polly.indvar = phi 0, polly.indvar.next
...
polly.indvar.next = polly.indvar + stride
polly.loop_cond = predicate polly.indvar.next, UB

The way Polly generates the induction variable causes problems in the indvar simplify pass.
This patch makes Polly generate the latter form, assuming the induction variable never overflows.

Differential Revision: https://reviews.llvm.org/D33089

llvm-svn: 302866
2017-05-12 02:17:15 +00:00
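Note on 4fe342cb75: the two exit tests are equivalent as long as `polly.indvar + stride` does not overflow, since `indvar <= UB - stride` and `indvar + stride <= UB` are the same inequality. The Python sketch below simulates both loop forms to show they visit the same values. The `<=` predicate and the do-while shape (body runs before the test) are assumptions, because the commit message leaves the predicate unspecified; Python integers never overflow, which matches the stated no-overflow assumption.

```python
def visited_old_form(ub, stride):
    """Exit test compares the current value against UB - stride."""
    visited, indvar = [], 0
    while True:
        visited.append(indvar)           # loop body
        indvar_next = indvar + stride
        if not (indvar <= ub - stride):  # polly.loop_cond, old form
            return visited
        indvar = indvar_next


def visited_new_form(ub, stride):
    """Exit test compares the incremented value against UB."""
    visited, indvar = [], 0
    while True:
        visited.append(indvar)           # loop body
        indvar_next = indvar + stride
        if not (indvar_next <= ub):      # polly.loop_cond, new form
            return visited
        indvar = indvar_next
```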
Michael Kruse d644ec7647 [DeLICM] Use input access heuristic for mapped PHI WRITEs.
As with the scalar operand of the initial StoreInst, also use input
accesses when searching for new opportunities after mapping a
PHI write.

The same rationale applies here: After LICM has been applied, the
promoted value will either be an instruction in the same statement
(in which case we fall back to try every scalar access of the
statement), or in another statement such that there will be such
an input access. In the latter case other scalars cannot have
originated from the same register promotion, at least not by LICM.

This mostly helps to decrease compilation time and makes debugging
easier by not pursuing unpromising routes. In some circumstances,
it may change the compiler's output.

llvm-svn: 302839
2017-05-11 22:56:59 +00:00
Michael Kruse 4c27643398 [DeLICM] Lookup input accesses.
Previous to this patch, we used VirtualUse to determine the input
access of an llvm::Value in a statement. The input access is the
READ MemoryAccess that makes a value available in that statement,
which can either be a READ of a MemoryKind::Value or the
MemoryKind::PHI for a PHINode in the statement. DeLICM uses the input
access to heuristically find a candidate to map without searching all
possible values.

This might modify the behaviour in that PHI accesses were previously
not considered input accesses. This was unintentionally lost when
"VirtualUse" was extracted from the "Known Knowledge" patch.

llvm-svn: 302838
2017-05-11 22:56:46 +00:00
Michael Kruse bfaa1857b3 [VirtualInstruction] Do a lookup instead of a linear search. NFC.
llvm-svn: 302837
2017-05-11 22:56:27 +00:00
Michael Kruse e60eca7316 [ScopInfo] Keep scalar access dictionaries up-to-date. NFC.
When removing a MemoryAccess, also remove it from maps pointing to it.
This was already done for InstructionToAccess, but not yet for
ValueReads, ValueWrites and PHIWrites as those were only used during
the ScopBuilder phase. Keeping them updated allows us to use them
later as well.

llvm-svn: 302836
2017-05-11 22:56:12 +00:00
Michael Kruse 07e315e780 [Simplify] Remove identical scalar writes.
After DeLICM, it is possible to have two writes of the same value to
the same location in the same statement when it determined that those
writes do not conflict (write the same value).

Teach -polly-simplify to remove one of the writes. It interferes with
the pattern matching of matrix-multiplication kernels and also seems
not to be optimized away by LLVM.

The algorithm is simple, has O(n^2) behaviour (n = max number of
MemoryAccesses in a statement) and only matches the most obvious cases,
but seems to be enough to pattern-match Boost ublas gemm.

Not handled cases include:
- StoreInst instructions (a.k.a. explicit writes), since the value might
  be loaded or overwritten between the two stores.
- PHINode, especially LCSSA, when the PHI value matches another's.
- Partial writes (in preparation)

llvm-svn: 302805
2017-05-11 15:07:38 +00:00
Siddharth Bhat abea18feba [NFC] [Fortran Support] move Fortran array detection testcases
move these testcases to where they belong: ScopDetect

llvm-svn: 302735
2017-05-10 21:35:14 +00:00
Michael Kruse a0987b83d5 [Simplify] Mark variables as used. NFC.
Mark one more variable as used that is needed in assertions.

llvm-svn: 302726
2017-05-10 20:45:10 +00:00
Michael Kruse 4aac59cee1 [Simplify] Mark variables as used. NFC.
Mark variables as used that are needed in assertions.

llvm-svn: 302725
2017-05-10 20:42:02 +00:00
Siddharth Bhat f5c81fb199 [Fix][Fortran Support] Don't use -debug-only in pattern matching test cases
-debug-only is unnecessary and causes the tests to break in Release
mode. Remove the option passed to opt in the test cases.

llvm-svn: 302722
2017-05-10 20:10:17 +00:00
Michael Kruse f41f274bf8 [DeLICM] Avoid compiler warning. NFC.
gcc 5.4 warns about using a C-style cast to cast away a const.
Use a const_cast instead.

llvm-svn: 302715
2017-05-10 19:58:52 +00:00
Michael Kruse f69a7c306b [DeLICM] Always normalize domain. NFC.
Some isl functions can simplify their __isl_keep arguments. The
argument object after the call uses different constraints to represent
the same set. Different constraints can result in different outputs
when printed to a string.

In assert builds, additional isl functions are called (for instance
inside assert()). As mentioned, these can change the internal
representation of their read-only arguments such that printed strings
are different in debug and non-debug builds.

What happened here is that a call to isl_set_is_equal inside an assert
in getScatterFor normalizes one of its arguments such that one redundant
constraint is removed. The redundant constraint therefore does not appear
in the string representing the domain, which FileCheck notices as a
regression test failure compared to a build with assertions disabled.

This fix removes the redundant constraints from the domain from the start such
that the redundant constraint is removed in assert and non-assert builds.
Isl adds a flag to such sets such that the removal of redundancies is
not done multiple times (here: by isl_set_is_equal).

Thanks to Tobias Grosser for reporting and hinting to the cause.

llvm-svn: 302711
2017-05-10 19:50:45 +00:00
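Note on f69a7c306b: the underlying effect is that two constraint lists can describe the same set yet print differently, so output depends on whether some earlier call happened to drop a redundant constraint. The toy Python model below, which is unrelated to isl's real data structures, represents a one-dimensional set by a lower bound and a list of upper bounds; `render` and `remove_redundancies` are hypothetical helpers.

```python
def render(lower, uppers):
    """Print the set { [i] : lower <= i and i <= u for every u in uppers }."""
    parts = [f"i >= {lower}"] + [f"i <= {u}" for u in uppers]
    return "{ [i] : " + " and ".join(parts) + " }"


def remove_redundancies(uppers):
    # Only the tightest upper bound matters; all others are redundant.
    return [min(uppers)]
```

Printing `render(0, [10, 20])` and `render(0, remove_redundancies([10, 20]))` gives different strings for the same set; normalizing up front makes both build types print the second form.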
Siddharth Bhat c47f039efd [Fix] [Fortran Support] Fix variable name & make testcase activate on release
There was:
    #ifdef NDEBUG

This should be:
    #ifndef NDEBUG

Also, the variable name was incorrect. Fixed the variable name.

llvm-svn: 302696
2017-05-10 17:27:48 +00:00
Philip Pfaffe d399607f65 [Polly][CMake] Fix syntactical errors in the exported config
llvm-svn: 302657
2017-05-10 13:51:30 +00:00
Siddharth Bhat f2dbba8183 [Fortran Support] Detect Fortran arrays & metadata from dragonegg output
Add the ability to tag certain memory accesses as those belonging to
Fortran arrays. We do this by pattern matching against known patterns
of Dragonegg's LLVM IR output from Fortran code.

Fortran arrays have metadata stored with them in a struct. This struct
is called the "Fortran array descriptor", and a reference to this is
stored in each MemoryAccess.

Differential Revision: https://reviews.llvm.org/D32639

llvm-svn: 302653
2017-05-10 13:11:20 +00:00
Siddharth Bhat 8ac5340a4e [GPUJIT] Disabled gcc's -Wpedantic for use of dlsym
ISO C, which GCC's `-Wpedantic` enforces, does not strictly define the
behavior of converting a `void*` pointer to a function pointer, but POSIX,
which specifies dlsym, does.

The retrieval of function pointers through dlsym in this case
generates an unnecessary amount of warnings for every API function
assignment, bloating the output.

This patch removes GCC's `-Wpedantic` flag for retrieval and assignment
of these functions. This simplifies debugging the output of GPUJIT.

Differential Revision: https://reviews.llvm.org/D33008

llvm-svn: 302638
2017-05-10 11:51:44 +00:00
Tobias Grosser f3adab4c20 [Polly] Canonicalize arrays according to base-ptr equivalence class
Summary:
    In case two arrays share base pointers in the same invariant load equivalence
    class, we canonicalize all memory accesses to the first of these arrays
    (according to their order in the equivalence class).

    This enables us to optimize kernels such as boost::ublas by ensuring that
    different references to the C array are interpreted as accesses to the same
    array. Before this change the runtime alias check for ublas would fail, as it
    would assume models of the C array with differing (but identically valued) base
    pointers would reference distinct regions of memory whereas the referenced
    memory regions were indeed identical.

    As part of this change we remove most of the MemoryAccess::get*BaseAddr
    interface. We already removed all references to get*BaseAddr in previous
    commits to ensure that no code relies on matching base pointers between
    memory accesses and scop arrays -- except for three remaining uses where we
    need the original base pointer. We document for these situations that
    MemoryAccess::getOriginalBaseAddr may return a base pointer that is distinct
    to the base pointer of the scop array referenced by this memory access.

Reviewers: sebpop, Meinersbur, zinob, gareevroman, pollydev, huihuiz, efriedma, jdoerfert

Reviewed By: Meinersbur

Subscribers: etherzhhb

Tags: #polly

Differential Revision: https://reviews.llvm.org/D28518

llvm-svn: 302636
2017-05-10 10:59:58 +00:00
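Note on f3adab4c20: the canonicalization step can be pictured as mapping every base pointer to the first member of its equivalence class and rewriting accesses accordingly. The Python sketch below models base pointers as strings and accesses as `(base, index)` pairs; it is an illustration of the idea in the summary, not ScopInfo code, and both function names are hypothetical.

```python
def canonical_arrays(equivalence_classes):
    """Map every base pointer to the first pointer of its class."""
    canonical = {}
    for members in equivalence_classes:
        first = members[0]
        for pointer in members:
            canonical[pointer] = first
    return canonical


def canonicalize_accesses(accesses, equivalence_classes):
    """Rewrite (base, index) accesses to use the canonical base pointer.
    Bases that belong to no class are left unchanged."""
    canonical = canonical_arrays(equivalence_classes)
    return [(canonical.get(base, base), index) for base, index in accesses]
```

After this rewrite, two references to the same C array through different but equal base pointers name the same array, so no alias check between them is needed.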
Tobias Grosser 0f7ce83018 Add noreturn attribute to avoid warnings about missing initialization
Before this change we saw warnings such as:

  tools/GPURuntime/GPUJIT.c:1566:3:
  warning: variable 'DevPtr' is used uninitialized whenever switch default is
  taken [-Wsometimes-uninitialized]
    default:

llvm-svn: 302621
2017-05-10 05:20:56 +00:00
Tobias Grosser 1a2e0e6415 Fix formatting in Polly
llvm-svn: 302620
2017-05-10 04:53:59 +00:00
Chandler Carruth d742e5efa8 Update Polly for LLVM API change r302571 that removed varargs functions
with a nullptr sentinel in favor of nicely typed variadic templates.

llvm-svn: 302618
2017-05-10 02:39:35 +00:00
Siddharth Bhat a90be207c6 [Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen
Summary: PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases).

Reviewers: grosser, Meinersbur, bollu

Reviewed By: bollu

Subscribers: nemanjai, yaxunl, Anastasia, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32961

llvm-svn: 302515
2017-05-09 10:45:52 +00:00
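Note on a90be207c6: the parameter layout described in the summary (values first, then one size per value, making the list twice as long) can be sketched as follows. This is a Python illustration of the layout only; the real code emits LLVM IR and calls into the GPU runtime, and `build_launch_parameters` and `split_for_opencl` are hypothetical names.

```python
def build_launch_parameters(arguments):
    """arguments: list of (value, size_in_bytes) pairs.
    Returns all values followed by all sizes."""
    values = [value for value, _ in arguments]
    sizes = [size for _, size in arguments]
    return values + sizes


def split_for_opencl(parameters):
    """Pair each value in the first half with its size in the second half.
    A runtime that does not need sizes can simply read the first half."""
    half = len(parameters) // 2
    return list(zip(parameters[:half], parameters[half:]))
```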
Siddharth Bhat 0c8dcfd743 [Polly][GPUJIT] Fixed OpenCL 2.0 min requirement for Error codes
Summary: Removed OpenCL error code identifiers introduced in version 2.0.

Reviewers: grosser, bollu

Reviewed By: bollu

Subscribers: yaxunl, Anastasia, pollydev, llvm-commits

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32962

llvm-svn: 302423
2017-05-08 14:10:37 +00:00
Siddharth Bhat 17f01968f1 [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302379
2017-05-07 21:03:46 +00:00
Siddharth Bhat 5cf77125fc [Polly] [GPUJIT] Adapted argument capitalization to fit standard
Summary: Function argument naming changed to reflect capitalization standards.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D32854

llvm-svn: 302376
2017-05-07 19:53:35 +00:00
Siddharth Bhat 448b8079cc [Polly] [GPUJIT] Moved error prints to stderr
Summary: Errors previously printed to stdout now get printed to stderr.

Reviewers: grosser, Meinersbur

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D32852

llvm-svn: 302375
2017-05-07 18:31:25 +00:00
Tobias Grosser c6ad42165f Really disable test as intended in the previous commit
llvm-svn: 302360
2017-05-06 19:18:19 +00:00
Tobias Grosser 0f4e94673d Disable test to avoid buildbot noise
This test was introduced in r302339. It works on my system, but breaks on the
buildbots.

llvm-svn: 302358
2017-05-06 18:50:28 +00:00
Michael Kruse 5ae08c0ebb [DeLICM] Known knowledge.
Extend the Knowledge class to store information about the contents
of array elements and which values are written. Two knowledges do
not conflict if the known content is the same. The content information
is computed from writes to and loads from the array elements, and
represented by "ValInst": isl spaces that compare equal if the value
represented is the same.

Differential Revision: https://reviews.llvm.org/D31247

llvm-svn: 302339
2017-05-06 14:03:58 +00:00
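Note on 5ae08c0ebb: the rule "two knowledges do not conflict if the known content is the same" can be modeled very roughly with dictionaries from array element to known value. This Python sketch ignores lifetimes and everything else the real Knowledge class tracks, and `conflicts` is a hypothetical helper rather than DeLICM's interface.

```python
def conflicts(known_a, known_b):
    """Two knowledge maps conflict only if some element is known in both
    and the known values differ."""
    for element, value in known_a.items():
        if element in known_b and known_b[element] != value:
            return True
    return False
```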
Michael Kruse 2a8f6f843f [CMake] Introduce POLLY_BUNDLED_JSONCPP.
Allow using a system-installed jsoncpp library instead of the bundled
one with the setting POLLY_BUNDLED_JSONCPP=OFF.

This fixes llvm.org/PR32929

Differential Revision: https://reviews.llvm.org/D32922

llvm-svn: 302336
2017-05-06 13:42:15 +00:00
Michael Kruse 391a2ac09b [ScopBuilder] Move Scop::init to ScopBuilder. NFC.
Scop::init is used only during SCoP construction. Therefore ScopBuilder
seems the more appropriate place for it. We integrate it into its only
caller ScopBuilder::buildScop where some other construction steps
already took place.

Differential Revision: https://reviews.llvm.org/D32908

llvm-svn: 302276
2017-05-05 20:09:08 +00:00
Tobias Grosser c1ddedc657 Fix typo
llvm-svn: 302244
2017-05-05 15:46:01 +00:00
Michael Kruse f1052ceb5e [ScopBuilder] Do not verify unfeasible SCoPs.
SCoPs with an unfeasible runtime context are thrown away and therefore
do not need their uses verified.

The added test case requires a complexity limit to be exceeded.
Normally, error statements are removed from the SCoP and for that
reason are skipped during the verification. If there is an unfeasible
runtime context (here: because of the complexity limit being reached),
the removal of error statements and other SCoP construction steps are
skipped to not waste time. Error statements are not modeled in SCoPs
and therefore have no requirements on whether the scalars used in
them are available.

llvm-svn: 302234
2017-05-05 13:38:35 +00:00
Tobias Grosser d5727c5011 Fix handling of signWrappedSets in access relations
Since r294891, in MemoryAccess::computeBoundsOnAccessRelation(), we skip
manually bounding the access relation in case the parameter of the load
instruction is already a wrapped set. Later on we assume that the lower
bound on the set is always smaller than or equal to the upper bound on the
set. Bug 32715 manages to construct a sign wrapped set, in which case
the assertion does not necessarily hold. Fix this by handling a sign
wrapped set similar to a normal wrapped set, that is skipping the
computation.
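
As an illustration of the distinction this fix relies on, here is a minimal
sketch using a toy 8-bit half-open range [Lower, Upper). The type `ToyRange`
and the helper `canDeriveBounds` are hypothetical and only mimic the idea of
a wrapped and a sign wrapped set; they are not the code touched by this
commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model of a half-open 8-bit range [Lower, Upper) that may
// wrap around. For illustration only; Lower == Upper is treated as empty.
struct ToyRange {
  uint8_t Lower;
  uint8_t Upper;

  // True if V lies in the range, taking wrap-around into account.
  bool contains(uint8_t V) const {
    if (Lower <= Upper)
      return Lower <= V && V < Upper;
    return V >= Lower || V < Upper;
  }

  // Wraps in the unsigned view: the range crosses 255 -> 0.
  bool isWrapped() const { return Lower > Upper; }

  // Wraps in the signed view: the range crosses 127 -> -128, i.e. it
  // contains both the signed maximum (0x7F) and the signed minimum (0x80).
  bool isSignWrapped() const { return contains(0x7F) && contains(0x80); }
};

// Mirrors the idea of the fix: only derive explicit bounds when neither kind
// of wrapping is present, so that "lower <= upper" holds in both views.
bool canDeriveBounds(const ToyRange &R) {
  return !R.isWrapped() && !R.isSignWrapped();
}
```

In this toy model the range [100, 200) is not wrapped in the unsigned view,
but it is sign wrapped, so a check for `isWrapped()` alone would let it
through.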

Contributed-by: Maximilian Falkenstein <falkensm@student.ethz.ch>

Reviewers: grosser

Subscribers: pollydev, llvm-commits

Tags: #Polly

Differential Revision: https://reviews.llvm.org/D32893

llvm-svn: 302231
2017-05-05 13:20:47 +00:00
Siddharth Bhat c1267b9baa Revert "[Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen"
This reverts commit 17a84e414adb51ee375d14836d4c2a817b191933.

Patches should have been submitted in the order of:

1. D32852
2. D32854
3. D32431

I mistakenly pushed D32431(3) first. Reverting to push in the correct
order.

llvm-svn: 302217
2017-05-05 09:02:08 +00:00
Siddharth Bhat 51904ae35a [Polly] Added OpenCL Runtime to GPURuntime Library for GPGPU CodeGen
Summary:
When compiling for GPU, one can now choose to compile for OpenCL or CUDA,
with the corresponding polly-gpu-runtime flag (libopencl / libcudart). The
GPURuntime library (GPUJIT) has been extended with the OpenCL Runtime library
for that purpose, correctly choosing the corresponding library calls to the
option chosen when compiling (via different initialization calls).

Additionally, a specific GPU Target architecture can now be chosen with -polly-gpu-arch (only nvptx64 implemented thus far).

Reviewers: grosser, bollu, Meinersbur, etherzhhb, singam-sanjay

Reviewed By: grosser, Meinersbur

Subscribers: singam-sanjay, llvm-commits, pollydev, nemanjai, mgorny, yaxunl, Anastasia

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32431

llvm-svn: 302215
2017-05-05 07:54:49 +00:00
Michael Kruse 704c03e03b [ScopBuilder] Add missing semicolon after LLVM_FALLTHROUGH.
It was forgotten in r302157.

llvm-svn: 302163
2017-05-04 15:55:54 +00:00
Michael Kruse eedae7630a Introduce VirtualUse. NFC.
If a ScopStmt references a (scalar) value, there are multiple
possibilities for where this value can come from. The decision about what
kind of use it is must be handled consistently at different places, which can be
error-prone. VirtualUse is meant to centralize the handling of the
different types of value uses.

This patch makes ScopBuilder and CodeGeneration use VirtualUse. This
already helps to show inconsistencies with the value handling. In order
to keep this patch NFC, exceptions to the general rules are added.
These might be fixed later if they turn out to be problems. Overall, this
should result in fewer post-codegen IR-verification errors, but instead
assertion failures in `getNewValue` that are closer to the actual error.

Differential Revision: https://reviews.llvm.org/D32667

llvm-svn: 302157
2017-05-04 15:22:57 +00:00
Michael Kruse 45d5cf47bf [CMake] Remove POLLY_TEST_DIRECTORIES.
The test subdirectory POLLY_TEST_DIRECTORIES was heavily outdated and
only used in out-of-LLVM-tree builds
(to generate polly-test-${subdir} targets).

llvm-svn: 302142
2017-05-04 12:21:25 +00:00
Tobias Grosser 3f25a7e8ee [ScopDetection] Check for already known required-invariant loads [NFC]
For certain test cases we spent over 50% of the scop detection time in
checking if a load is likely invariant. We can avoid most of these checks by
testing early on if a load is expected to be invariant. Doing this reduces
scop-detection time on a large benchmark from 52 seconds to just 25 seconds.

No functional change is expected.

llvm-svn: 302134
2017-05-04 10:16:20 +00:00
Tobias Grosser 1859463876 Adjust test case to not trigger the SCEV optimization committed in r302096
This makes sure we still test the case that a PHI-NODE cannot be analyzed by
scalar evolution and consequently must be code generated explicitly.  As
Michael's optimization triggers only on a very specific "add %iv, %step"
pattern, just changing 'add' to 'mul' adds back test coverage.

llvm-svn: 302132
2017-05-04 08:56:54 +00:00
Tobias Grosser e2ccc3fb33 [ScopInfo] Do not use LLVM names to identify statements, arrays, and parameters
LLVM-IR names are commonly available in debug builds, but often not in release
builds. Hence, using LLVM-IR names to identify statements or memory reference
results makes the behavior of Polly depend on the compile mode. This is
undesirable. Hence, we now just number the statements instead of using LLVM-IR
names to identify them (this issue has previously been brought up by Zino
Benaissa).

However, as LLVM-IR names help in making test cases more readable, we add an
option '-polly-use-llvm-names' to still use LLVM-IR names. This flag is by
default set in the polly tests to make test cases more readable.

This change reduces the time in ScopInfo from 32 seconds to 2 seconds for the
following test case provided by Eli Friedman <efriedma@codeaurora.org> (already
used in one of the previous commits):

  struct X { int x; };
  void a();
  #define SIG (int x, X **y, X **z)
  typedef void (*fn)SIG;
  #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
  #define FN5 FN FN FN FN FN
  #define FN25 FN5 FN5 FN5 FN5
  #define FN125 FN25 FN25 FN25 FN25 FN25
  #define FN250 FN125 FN125
  #define FN1250 FN250 FN250 FN250 FN250 FN250
  void x SIG { FN1250 }

For a larger benchmark I have on-hand (10000 loops), this reduces the time for
running -polly-scops from 5 minutes to 4 minutes, a reduction by 20%.

The reason for this large speedup is that our previous use of printAsOperand
had a quadratic cost, as for each printed and unnamed operand the full function
was scanned to find the instruction number that identifies the operand.
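
The cost difference described above can be sketched with a toy model. The
type `ToyInst` and all function names are hypothetical; the code only counts
how many instructions are visited when every unnamed instruction is
identified by a scan from the start of the function, compared with handing
out numbers from a running counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an unnamed instruction; hypothetical, not an LLVM type.
struct ToyInst {};

// Finds the slot number of Target by scanning the function from the start.
// Steps accumulates the number of instructions visited.
std::size_t slotByScan(const std::vector<ToyInst> &Func, const ToyInst *Target,
                       std::size_t &Steps) {
  for (std::size_t I = 0; I < Func.size(); ++I) {
    ++Steps;
    if (&Func[I] == Target)
      return I;
  }
  return Func.size();
}

// Identifies every instruction via a scan: 1 + 2 + ... + N visits in total.
std::size_t stepsWithScan(const std::vector<ToyInst> &Func) {
  std::size_t Steps = 0;
  for (const ToyInst &I : Func)
    slotByScan(Func, &I, Steps);
  return Steps;
}

// Identifies every instruction with a running counter: N steps in total.
std::size_t stepsWithCounter(const std::vector<ToyInst> &Func) {
  std::size_t NextId = 0;
  for (std::size_t I = 0; I < Func.size(); ++I)
    ++NextId; // each instruction simply receives the next number
  return NextId;
}
```

For a function with N instructions the scan-based variant visits
N * (N + 1) / 2 instructions, while the counter-based variant needs N steps.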

We do not need to adjust the way memory reference ids are constructed, as
they do not use LLVM values.

Reviewed by: efriedma

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32789

llvm-svn: 302072
2017-05-03 20:08:52 +00:00
Siddharth Bhat 88619946b6 [CUDA Managed Memory] Fix regression introduced by Managed Memory
- Fixes breakage from commit 5536f.
- Interference with commit 764f3 caused testcase to fail. Reverting
  764f3 allows commit 5536f to succeed.
- Generated kernel code was slightly different due to 764f3, which
  caused testcase to fail.

llvm-svn: 302021
2017-05-03 13:15:27 +00:00
Tobias Grosser 72684bbaf5 [ScopInfo] Remove code not needed anymore after r302004
llvm-svn: 302005
2017-05-03 08:02:32 +00:00
Tobias Grosser 8133128c17 [ScopInfo] Do not add array name into memory reference ids
Before this change a memory reference identifier had the form:

  <STMT>_<ACCESSTYPE><ID>_<MEMREF>, e.g., Stmt_bb9_Write0_MemRef_tmp11

After this change, we use the format:

  <STMT>_<ACCESSTYPE><ID>, e.g., Stmt_bb9_Write0

The name of the array that is accessed through a memory reference is not
necessary to uniquely identify a memory reference, but was only added to
provide additional information for debugging. We drop this information now
for the following two reasons:

  1) This shortens the names and consequently improves readability
  2) This removes a second location where we decide on the name of a scop array,
     leaving us only with the location where the actual scop array is created.

Having after 2) only a single location to name scop arrays will allow us to
change the naming convention of scop arrays more easily, which we will do
in a future commit to reduce compilation time.
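
The two identifier formats described above can be sketched as follows. The
helper names `oldMemRefId` and `newMemRefId` are hypothetical and chosen for
illustration; they only reproduce the string layout given in this message.

```cpp
#include <cassert>
#include <string>

// Old format: <STMT>_<ACCESSTYPE><ID>_<MEMREF>
// (hypothetical helper, illustrative only)
std::string oldMemRefId(const std::string &Stmt, const std::string &AccessType,
                        unsigned Id, const std::string &MemRef) {
  return Stmt + "_" + AccessType + std::to_string(Id) + "_" + MemRef;
}

// New format: <STMT>_<ACCESSTYPE><ID>
// The array name is no longer part of the identifier.
std::string newMemRefId(const std::string &Stmt, const std::string &AccessType,
                        unsigned Id) {
  return Stmt + "_" + AccessType + std::to_string(Id);
}
```

With the example from this message, `newMemRefId("Stmt_bb9", "Write", 0)`
yields `Stmt_bb9_Write0`.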

llvm-svn: 302004
2017-05-03 07:57:35 +00:00
Siddharth Bhat 6c3d19ba45 [NFC] [IslAST] fix typo: "int the" -> "in the"
llvm-svn: 301925
2017-05-02 14:54:49 +00:00
Michael Kruse ecbd57e98a [CMake] Move PollyCore to Polly project folder.
This keeps the artifacts consistently structured in the "Polly"
folder of Visual Studio solutions.

llvm-svn: 301779
2017-04-30 21:07:05 +00:00
Hongbin Zheng e9a9932712 [Polly] Make PollyCore depend on intrinsics_gen
llvm-svn: 301734
2017-04-29 03:12:17 +00:00
Tobias Grosser 3d76f2ccd3 [tests] Ensure all test cases use named variables
This makes it easier to read and possibly even modify the test cases, as there
is no need to keep the variable increment in steps of one. More importantly, by
using explicit variable names we do not need to rely on the implicit numbering
of statements when dumping the scop information.

This makes it easier to read and possibly even modify the test cases.
Furthermore, by using explicit variables we do not need to rely on the implicit
numbering of statements when dumping the scop information. In a future commit,
this implicit numbering will likely not be used any more to refer to LLVM-IR
values as it is very expensive to construct.

llvm-svn: 301689
2017-04-28 21:16:29 +00:00
Tobias Grosser f13722177b [Codegen] Disable Polly's codegen verification by default
As has been reported in the previous commit, codegen verification can result in
quadratic compile time increases for large functions with many scops. This is
certainly not something we would like to have in the Polly default
configuration. Hence, we disable codegen verification by default -- also to see
if this resolves some of the compilation timeouts we currently see on the AOSP
buildbots. We still leave this feature in Polly as it has proven _very_ useful
for debugging. In fact, we may want to have a discussion if we can bring this
feature back in a way that does not impact compilation time so much.

Thanks to Eli Friedman <efriedma@codeaurora.org> for reporting this issue and
for providing the test case in the previous commit (where I forgot to
acknowledge him).

llvm-svn: 301670
2017-04-28 19:15:28 +00:00
Tobias Grosser d439911f73 [CodeGen] Skip verify if -polly-codegen-verify is set to false
Before this change, we always tried to verify the function and printed
verification errors, but just did not abort in case -polly-codegen-verify=false
was set and verification failed. As verification can become very costly -- for
large functions with many scops we may verify the very same function very often
-- this can affect compile time very negatively. Hence, we respect the
-polly-codegen-verify flag with this check, ensuring that no verification is run
if -polly-codegen-verify=false.
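
The change in control flow can be sketched as follows. `ToyFunction`,
`toyVerify`, `checkBefore` and `checkAfter` are hypothetical stand-ins, not
Polly's functions; the sketch only shows that the flag used to decide
whether to abort, and now decides whether the verifier runs at all.

```cpp
#include <cassert>

// Hypothetical stand-in for a function under verification.
struct ToyFunction {
  bool Broken = false;     // whether verification would report an error
  unsigned VerifyRuns = 0; // how often the (expensive) verifier ran
};

// Pretend verifier: returns true if the function is broken.
bool toyVerify(ToyFunction &F) {
  ++F.VerifyRuns;
  return F.Broken;
}

// Before: the verifier always runs; the flag only controls the abort.
// Returns true if compilation should abort.
bool checkBefore(ToyFunction &F, bool VerifyFlag) {
  bool Failed = toyVerify(F);
  return Failed && VerifyFlag;
}

// After: with the flag off, the verifier is never invoked.
bool checkAfter(ToyFunction &F, bool VerifyFlag) {
  return VerifyFlag && toyVerify(F);
}
```

Both variants return the same decision; they differ only in how often the
verifier runs when the flag is off.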

This reduces code generation time from 26 seconds to 4 seconds on the test
case below with -polly-codegen-verify=false:

  struct X { int x; };
  void a();
  #define SIG (int x, X **y, X **z)
  typedef void (*fn)SIG;
  #define FN { for (int i = 0; i < x; ++i) { (*y)[i].x += (*z)[i].x; } a(); }
  #define FN5 FN FN FN FN FN
  #define FN25 FN5 FN5 FN5 FN5
  #define FN125 FN25 FN25 FN25 FN25 FN25
  #define FN250 FN125 FN125
  #define FN1250 FN250 FN250 FN250 FN250 FN250
  void x SIG { FN1250 }

llvm-svn: 301669
2017-04-28 19:08:20 +00:00
Siddharth Bhat abed49699b [Polly] [PPCGCodeGeneration] Add managed memory support to GPU code
generation.

This needs changes to GPURuntime to expose synchronization between host
and device.

1. Needs better function naming, I want a better name than
"getOrCreateManagedDeviceArray"

2. DeviceAllocations is used by both the managed memory and the
non-managed memory path. This exploits the fact that the two code paths
are never run together. I'm not sure if this is the best design decision

Reviewed by: PhilippSchaad

Tags: #polly

Differential Revision: https://reviews.llvm.org/D32215

llvm-svn: 301640
2017-04-28 11:16:30 +00:00
Tobias Grosser 287942ae82 Update to isl-0.18-592-gb50ad59
This is just a general maintenance update.

llvm-svn: 301624
2017-04-28 06:11:17 +00:00
Tobias Grosser c96c1d8c87 [ScopInfo] Consider only write-free dereferenceable loads as invariant
When we introduced in r297375 support for hoisting loads that are known
to be dereferenceable without any conditional guard, we forgot to keep the check
to verify that no other write into the very same location exists. This
change ensures now that dereferenceable loads are allowed to access everything,
but can only be hoisted in case no conflicting write exists.

This resolves llvm.org/PR32778

Reported-by: Huihui Zhang <huihuiz@codeaurora.org>
llvm-svn: 301582
2017-04-27 20:08:16 +00:00
Michael Kruse 792a6fcc57 [CMake] Use object library to build the two flavours of Polly.
Polly comes in two library flavors: One loadable module to use the
LLVM framework -load mechanism, and another one that host applications
can link to. These have very different requirements for Polly's
own dependencies.

The loadable module assumes that all its LLVM dependencies are already
available in the address space of the host application, and is not allowed
to bring in its own copy of any LLVM library (including the NVPTX
backend in case of Polly-ACC).

The non-module library is intended to be linked to using
target_link_libraries. CMake would then resolve all of its dependencies,
including NVPTX and ensure that only a single instance of each library
will be used.

Differential Revision: https://reviews.llvm.org/D32442

llvm-svn: 301558
2017-04-27 16:13:03 +00:00
Philip Pfaffe 5d790fc03c [Polly][Cmake] Add missing include paths to exported cmake config
llvm-svn: 301552
2017-04-27 16:03:42 +00:00
Hongbin Zheng 0f8f177682 [Polly] Do not introduce address space cast
Do not introduce address space cast in IslNodeBuilder::preloadUnconditionally.

Differential Revision: https://reviews.llvm.org/D32581

llvm-svn: 301519
2017-04-27 06:42:14 +00:00
Michael Kruse e6d2bebb25 [unittests/DeLICM] Add test for Written vs Written.
The interpretation of multiple known ValInsts for the same element and
timepoint is that these are alternative names for the same value,
for instance a PHINode and the incoming value when knowing it was
the last executed block. That means that known values do not conflict
if there is at least one (but not necessarily all) common ValInst.

This principle also applies to Written values. Add a test for this
principle.
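
The rule stated above can be sketched with plain sets of names. The alias
`ToyValInsts` and the function `knownValuesConflict` are hypothetical; the
sketch assumes both sets are non-empty and leaves aside how unknown content
is handled.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model: the alternative names (ValInsts) known for one array
// element at one timepoint.
using ToyValInsts = std::set<std::string>;

// Two sets of known values conflict only if they share no name at all.
// Assumes both sets are non-empty.
bool knownValuesConflict(const ToyValInsts &A, const ToyValInsts &B) {
  for (const std::string &V : A)
    if (B.count(V))
      return false; // one common name is enough
  return true;
}
```

For example, `{"phi", "incoming"}` and `{"incoming"}` do not conflict,
whereas `{"a"}` and `{"b"}` do.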

llvm-svn: 301481
2017-04-26 21:52:55 +00:00